Jump to content
The Dark Mod Forums

Bogus Timestamps In TDM Update Files Cause TDM Crashes Under Linux


NightStalker

Recommended Posts

Regarding these bad timestamps, there is definetly a bug open already for about a year (?).

As far as I recall it was scheduled to be implemented in 2.05, and basicly should make the tdm_update correct all bad timestamps.

Can't point you to the bug-tacker right now though, because the Bugtracker is 'currently down for maintenance'.

Link to comment
Share on other sites

OK, many thanks for the comments, folks, especially to those pointing me to the BugTracker. I (foolishly) didn't even think to check it with regard to this issue, but in a preliminary check just now, I can see that there's some important, relevant history there.

I'm a bit dismayed at how much time I had to spend to find what is, essentially, a long-standing, still-outstanding issue that was recently discussed, but it's "water under the bridge" now, I guess. In the future, I'll be sure to peruse the BugTracker, you can bet!

I've PMed taaaki asking him to read and comment here.

Until he comments and until I've had a chance to fully read up on the history of this issue, I'll probably be mostly silent, since actually fixing this issue is mostly out of my hands now.

Is this something you could attempt..? and could we have x86 & x64 binaries side by side..?

"Could"? Yes. "Will"? Sorry, no. I simply have too many other projects going and, for the last month, I've been neglecting them in favor of TDM and that cannot continue. I may have inputs and offer occasional assistance if someone spearheads the drive to 64-bit, but that's the best I can offer ATM. Sorry. :(

Whatever happened to 'NagaHuntress'? It seemed like she was well along the way to converting TDM to a 64-bit app and I was impressed with her contributions.

  • Like 1
Link to comment
Share on other sites

NightStalker:

 

As you probably gleaned from the bug tracker issue, the changes to the updater were implemented as a workaround. This was because the only way to have the packager build "correct" pk4s was to force a situation where the afflicted files would have to be regenerated completely due to an "update" to an existing file and we couldn't ensure this for all the afflicted files. As I recall, this was implemented as an 11th hour kind of thing, so I admit that I probably didn't test it as well as I should have (and I don't think I got around to testing it on Linux at all) - hence the lack of issues on Windows.

 

In theory, it shouldn't matter that the tdm_update_* files have bad dates as the cleanup step in the updater should sort this out. But it is quite likely that I may have missed something in relation to the differential updates, which would not surprise me given how difficult it is to follow the updater code (as you can attest). I'll have to try and fix my Linux environment and run some some tests on there to see if I can isolate why the cleanups didn't work (in particular, I'll need to see why the differential updates still have this issue since the packager was "fixed" prior to the release of 2.04). And I agree that this is treating the symptoms, but unless we can actually track down and properly resolve the issue in the underlying zip library (or switch to a more modern version or alternative library), I'm not sure how to finally resolve this.

 

I assume that you also checked this bug for background: http://bugs.thedarkmod.com/view.php?id=4110

In that issue, you can see that neither gnartsch, nor myself could figure out why the dates were getting our of whack in the zip/pk4 files in the first place, but my suspicion was that the minizip library used by Doom3 is at fault.

 

I think it might be possible to regenerate older releases, but I'll have to make sure that the packager is actually buillding valid differential update zip files. Anyway, I'll take another look at this during the week and see what I can come up with taking your suggestions into account.

I am the bat. The night is mine.

Link to comment
Share on other sites

Taaaki,

Many thanks for your input on this! I actually have not had time to go back and read all the relevant stuff in the BugTracker, but I will accelerate that plan now that you've commented.

I would very much appreciate you fixing up your Linux environment because the input from someone much closer to the matter than myself would be helpful.

So, IIUC, there is no issue with the timestamps on the actual "TDM resource" files but they somehow get corrupted only when they're packaged into PK4 files?

I'm starting to think that your reference to "the packager" is the 'tdm_update/tdm_package' tree in SVN. I didn't have time to dig into that when I was researching these Linux crashes, but it sounds like that may be at the heart of the problem, possibly combined with the 'minizip' code.

I, too, was suspicious of the 'minizip' code in the Doom3/TDM project, but never actually took any time to dig into it -- I was simply overwhelmed by all the other things, mired in deep confusion and unfamiliarity.

I've got some new things to think about and look into now....

Thanks again for your inputs -- much appreciated!

  • Like 1
Link to comment
Share on other sites

Good news... I believe I've finally gotten to the bottom of what is corrupting the timestamps on the TDM update zip files ('tdm_update_2.0{0-4}_to_2.0{1-5}.zip'). As is often the case, it's a chain of unfortunate events, in this case triggered by some sloppy coding.

TLDR:

There are calls to 'mktime()' in 3 spots in the TDM code where the Daylight Saving Time (DST) input is not properly set up, essentially causing random cases where the returned time value is off by 1 hour.

By itself, that would be relatively harmless. But in the "right" circumstances, coupled with another factor (explained below) it's dangerous.

One of these 3 cases is in the TDM "packager" code, which can result in bad timestamps being generated in the zip files distributed for "incremental" TDM updates. This in turn corrupts several TDM resource files used by the game itself, causing crashes for Linux users when running certain missions which use certain resource files afflicted with bad timestamps, as explained in my 1st post of this thread.

BTW, none of these cases of inadequate 'mktime()' setup can be blamed on the Doom3 source code. These are all errors introduced by TDM folks. Moral of the story: Please be careful with your inputs, people!

Backstory:

I was able to duplicate the problem reasonably quickly, by running a customized version of 'tdm_package' with a customized PK4 file full of game resource files.

By simply altering one of the TDM resource files so that it had a timestamp of 01 Jan 1980 00:00:00, I was able to generate a TDM update zip file that had the bogus, illegal (all bits set) timestamp on that custom-dated file.

The bug in the 'mktime()' setup mentioned above caused the time to sometimes jump back by 1 hour, which effectively changed the timestamp on my "test" file to 31 Dec 1979 23:00:00. The 1-hour alteration is quite unpredictable, since the 'is DST' flag is not initialized before the call and contains, essentially, garbage. Therefore, sometimes the garbage causes 'mktime()' to think that DST is in effect and other times not. So, running 'tdm_package' twice in a row on identical input can result in files with different timestamps!

But there is another element to this problem....

The TDM "updater" app and the TDM "packager" app both use a 'minizip' source code "library" of sorts to do various zip/unzip operations. It contains questionable code in some spots. It expects the caller-supplied "year" value will be 1980-2107 (an absolute year, with century) or 80-207 (an absolute year, without century) or simply 0-127 (relative to 1980), subtracting either 1980, 80, or nothing to make the range 0-127 (and thereby legal for a zip file's 7-bit year specification). It simply was not written to tolerate a date with any other year. So our "1979" case (from the bad 'mktime()' call setup) causes it to erroneously subtract 80 years, resulting in a year of 1899! When this value is used to compute a DOS/ZIP-style timestamp, expected to be using a year in the range of 0-127, it computes a timestamp that exceeds the 4-byte (32-bit) allotment. Rather unfortunately, this further triggers some curious code in the 'minizip' library with the terse comment "data overflow - hack for ZIP64". This in turn causes 0xFF to be written to all 4 bytes of the date/time area of the zip file!

From there, it was easy to see what effect this had on the timestamps. That is, the timestamps had every bit in the 32-bit value set, which computes to a particular (grossly illegal!) timestamp of "15-31-2107 31:63:62", exactly as described in my 1st post.

The mystery was finally solved!

Solution:

Modify TDM code to set the 'tm_isdst' (DST) flag properly in every case where 'mktime()' is called. Suggested setting would be -1, which causes 'mktime()' to figure out the appropriate value.

This won't eliminate the odd "data overflow" code in the 'minizip' library, but it should prevent it from being invoked. That is, if we don't pass the 'minizip' library any dates with an effective year below 1980 (or beyond the 2107 [1980+127] limit of the zip file format), then all should be well.

The real problem now is dealing with all of the bad resource files (and to some extent, bad TDM upgrade zip files) that are already "out in the wild". This is the same issue as I mentioned in my 1st post, so I won't repeat it here, but it will need to be dealt with somehow.

Other Thoughts:

Without having seen any of the timestamps on the original resource files used to create these 'TDM upgrade' zip files, I cannot be 100% certain that this fix will solve all the bad timestamps. But I suspect that it's a major source of problems, and quite possibly the only source.

This "mktime() DST setup" bug is in 2 other spots in the TDM source code, so it may be affecting something else entirely. These 2 cases will need to be investigated further at some point.

I need to give it more thought, but when the fix described above has been in place for a while and has been fully verified, there really should be no need for the 'tdm_update' utility to be checking and altering timestamps in the PK4 "child" files any longer. That was/is essentially just "treating the symptoms", I believe, and I think we now know the "cure for the disease" (I hope!). So, at some point, we should give some thought to actually removing the code in the 'tdm_update' utility that tweaks timestamps.

The question arose earlier: Why didn't the 'tdm_update' app fix the broken timestamps in the PK4 child files given that it's specifically designed to do so? Without looking into it, I'm not sure but I'll hazard a guess that those broken dates may have triggered the very same "data overflow - hack for ZIP64" code in the 'minizip' library. Or something similar.

If I don't hear from taaaki (who I've PMed) reasonably soon (2 days?), my plan is to implement the simple fix and commit it to SVN, unless I hear otherwise.

As always, if I've left anything out, please let me know.

  • Like 4
Link to comment
Share on other sites

Nice find indeed. I have no issue with you committing the mktime() related changes. Once you do, I'll just need to know so that I can recompile the packager binary on the server (have to do it there since the server runs FreeBSD) and recommit the bin to SVN. I can then investigate rebuilding the differential updates and syncing to the mirrors.

 

We might still need some process for fixing a user's existing install (for cases where there are bad PK4s from previous updates). So shouldn't we leave the timestamp altering stuff in the updater for a release or two (after fixing it for Linux)? Or do you see another way, like fixing the file reading code in TDM itself so that it doesn't get tripped up by the bad dates?

I am the bat. The night is mine.

Link to comment
Share on other sites

I have no issue with you committing the mktime() related changes.

Ok, I will try to get the update done sometime in the next 24 hours.

Once you do, I'll just need to know so that I can recompile the packager binary on the server [...]

Wilco. Will PM you when I'm ready.

We might still need some process for fixing a user's existing install (for cases where there are bad PK4s from previous updates). So shouldn't we leave the timestamp altering stuff in the updater for a release or two (after fixing it for Linux)? Or do you see another way, like fixing the file reading code in TDM itself so that it doesn't get tripped up by the bad dates?

I don't have any objection to leaving the timestamp check/tweak in place for a while, but since it didn't work (in my case, anyway), I think someone needs to investigate why not. I suppose I can do that, but, again, I'm both lousy at C++ and really hate it!

 

As for "another way", it's certainly possible, but despite initially digging deeply into the non-primary cause (i.e. what turned out to be just "the symptoms"), I still have no strong understanding of why the resources are time-checked, let alone the confidence to start modifying that C++ code! :o

 

Give me some time to think about this last quoted bit of yours some more. Will comment more when I can.

 

Thanks for your input, taaaki!

Link to comment
Share on other sites

  • 1 year later...

Could you please explain one thing for those late to party: why the hell are timestamps important?

 

In my opinion, I can randomize timestamps of all pk4 files to everything ranging from 2000 B.C. to year 10000.

And TDM should still work perfectly after that.

Link to comment
Share on other sites

Could you please explain one thing for those late to party: why the hell are timestamps important?

I explained in a general way in the 1st post of this thread:

It's not intuitive at all (to me), but the processing of some of these TDM "resources" is based on timestamps.

 

More specifically, getting that bogus timestamp from a file within a PK4 file results in the low-level 'ReadFile()' routine returning a timestamp of -1, which screws things up further down the line in very subtle ways that ultimately manifests itself as an unprotected attempt to write to a null pointer, thereby crashing the game!

Refreshing my memory by re-reading my notes on this from Dec 2016, I'll point you to the 'R_ParseImageProgram_r()' routine in the 'renderer/Image_program.cpp' file. Beyond that, I'll leave it to the author of that code to explain. ;)

Link to comment
Share on other sites

I explained in a general way in the 1st post of this thread:

Refreshing my memory by re-reading my notes on this from Dec 2016, I'll point you to the 'R_ParseImageProgram_r()' routine in the 'renderer/Image_program.cpp' file. Beyond that, I'll leave it to the author of that code to explain. ;)

I think timestamps are used to make reloadImages work (only images with newer timestamps are loaded --- otherwise it would be too long).

Wouldn't it be OK for players if we ensure that timestamp = -1 is never returned from file operations when file exists?

Something like if (timestamp == -1) timestamp = -2 ?

Link to comment
Share on other sites

The timestamps of the files in the pk4's are also used by the resource loader to determine whether a file is found or not.

In such way that a timestamp=-1 was interpreted as 'file doesn't exist' and the resource would not be loaded at all.

Anyway: these issues should be fixed since TDM 2.05 already.

Link to comment
Share on other sites

Wouldn't it be OK for players if we ensure that timestamp = -1 is never returned from file operations when file exists?

I agree with 'gnartsch'. I don't think there's any need to try to fix anything in this regard.

 

The packager app had the bug originally. Unfortunately, it went undetected and a "band-aid" fix (which itself had a flaw) was applied to the updater app. Shortly after I became involved with TDM, I eventually found and fixed both flaws (back in late Dec 2016). IMHO, now that we've been through 2 TDM releases with the fixes in place and no reports of problems, we should be able to remove (at any convenient time) the entire "band-aid" (i.e. the entire block of code that does anything to the timestamps) in the updater app. Since the packager app (the true source of the problem) has been fixed, there should be no problems going forward, I believe.

Link to comment
Share on other sites

The packager app had the bug originally. Unfortunately, it went undetected and a "band-aid" fix (which itself had a flaw) was applied to the updater app. Shortly after I became involved with TDM, I eventually found and fixed both flaws (back in late Dec 2016). IMHO, now that we've been through 2 TDM releases with the fixes in place and no reports of problems, we should be able to remove (at any convenient time) the entire "band-aid" (i.e. the entire block of code that does anything to the timestamps) in the updater app. Since the packager app (the true source of the problem) has been fixed, there should be no problems going forward, I believe.

Created issue 4864 for removing this hotfix. Planned for 2.08, where tdm_update will get big cleanup anyway.

  • Like 1
Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

  • Recent Status Updates

    • Petike the Taffer

      I've finally managed to log in to The Dark Mod Wiki. I'm back in the saddle and before the holidays start in full, I'll be adding a few new FM articles and doing other updates. Written in Stone is already done.
      · 0 replies
    • nbohr1more

      TDM 15th Anniversary Contest is now active! Please declare your participation: https://forums.thedarkmod.com/index.php?/topic/22413-the-dark-mod-15th-anniversary-contest-entry-thread/
       
      · 0 replies
    • JackFarmer

      @TheUnbeholden
      You cannot receive PMs. Could you please be so kind and check your mailbox if it is full (or maybe you switched off the function)?
      · 1 reply
    • OrbWeaver

      I like the new frob highlight but it would nice if it was less "flickery" while moving over objects (especially barred metal doors).
      · 4 replies
    • nbohr1more

      Please vote in the 15th Anniversary Contest Theme Poll
       
      · 0 replies
×
×
  • Create New...