The Dark Mod Forums

Mission downloader issue



I must admit I'm rather nooby in Linux development 😥

I looked into the dump, but could not get meaningful stack traces. It seems that TDM is killed by SIGABRT, and that the core dump contains the state after the signal was handled. This is quite strange, because the manuals say that gdb stops on it by default.

Maybe you could try again, but please execute "handle SIGABRT stop" in gdb before executing "run"?


Another approach is to get the stack trace on your machine.
Download the debug symbols: get and unpack thedarkmod.x64.debug, and put it right next to thedarkmod.x64.
Then run gdb ./thedarkmod.x64. You should see a message like this:

Reading symbols from /mnt/hgfs/thedarkmod/darkmod_209/thedarkmod.x64.debug...done.

Then execute "run" and reproduce the crash.
After that, execute "bt" in gdb: it should print a meaningful stack trace.
You can also switch between threads using "info threads" and "thread 1" / "thread 2" / ..., but I hope it won't be necessary.


OK, I shall give that a go. Meanwhile I've had a thought. You've probably discounted this already, but based on what I'm seeing with the crashes I suspect it's a buffer-overrun type of error

When selecting missions to download TDM copies the entry from the downloaded list & builds a separate list of missions to download

Then works its way through this second list, removing the first entry after each download

It's this second list that's corrupted somehow

I've seen errors that mention "doubly linked list", so at a guess in each list entry there's a pointer back & a pointer forward

As a minimum there'll also be a string for the description that's displayed and another string telling TDM where to download the file from

So 2 strings & 2 pointers per entry as a minimum

What if, when this entry is created, the size of the download address string is used to reserve space for the description string, or vice versa? It's a remarkably easy thing to get wrong

If the description is shorter than the address then all will be fine. If it's longer, then the description will overrun the space allocated, and if the pointers holding the list together come after that they'll be overwritten
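
To make the idea concrete, here is a minimal C++ sketch of the kind of mistake I mean. It is purely illustrative: the struct and the function names (DownloadEntry, bytesNeeded, overrunBytes) are made up and are not taken from the TDM source. It only works out how far a copy would run past the buffer; it does not actually overrun anything

```cpp
#include <cstddef>
#include <string>

// Hypothetical download-list entry: two strings and two links, as guessed above.
// This is NOT the real TDM data structure.
struct DownloadEntry {
    char* description;    // text shown in the download list
    char* url;            // where the file is fetched from
    DownloadEntry* prev;  // link to the previous entry
    DownloadEntry* next;  // link to the next entry
};

// Bytes needed to hold a string as a C string, including the terminating '\0'.
std::size_t bytesNeeded(const std::string& s) {
    return s.size() + 1;
}

// Models the suspected mistake: the description buffer is sized from the URL.
// Returns how many bytes a copy of the description would write past the end
// of that buffer (0 means it fits and nothing is damaged).
std::size_t overrunBytes(const std::string& description, const std::string& url) {
    const std::size_t reserved = bytesNeeded(url);          // wrong string used for the size
    const std::size_t required = bytesNeeded(description);  // what the copy actually writes
    return required > reserved ? required - reserved : 0;
}
```

With a 10-character address and a 25-character description, overrunBytes reports 15 bytes written past the buffer, which would be more than enough to trash whatever sits next to it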

Just a thought, it's been a while since I did any serious programming


OK, I downloaded the debug symbols & dropped them in my darkmod folder, ran TDM inside gdb as before

gdb read the symbols

Quote

Reading symbols from ./thedarkmod.x64...Reading symbols from /home/karen/darkmod/thedarkmod.x64.debug...done.

did a handle SIGABRT stop

Quote

(gdb) handle SIGABRT stop
Signal        Stop    Print    Pass to program    Description
SIGABRT       Yes    Yes    Yes        Aborted

And then ran tdm

I had a mission installed so I uninstalled that first, with no apparent problems

I selected 4 missions to download starting with "Alberics curse" and got the first 3 before the system stopped, all 3 pk4 files open without error

I alt tabbed back to the console window

I tried to generate a core dump but got the message

Quote

warning: Memory read failed for corefile section, 4096 bytes at 0xffffffffff600000.
Saved corefile core.6487

So I'm not sure what state the core file is in, I zipped & uploaded it anyway

I also executed a "bt" & got the following

Quote

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff6d21875 in __GI_abort () at abort.c:79
#2  0x00007ffff6d88856 in __libc_message (action=action@entry=do_abort,
    fmt=fmt@entry=0x7ffff6eac560 "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007ffff6d8f6ea in malloc_printerr (
    str=str@entry=0x7ffff6eae200 "free(): invalid next size (fast)")
    at malloc.c:5336
#4  0x00007ffff6d915e6 in _int_free (av=0x7fffac000020, p=0x7fffac000c00,
    have_lock=<optimised out>) at malloc.c:4199
#5  0x000000000073c545 in std::string::_Rep::_M_dispose (__a=...,
    this=<optimised out>) at /usr/include/c++/5/bits/basic_string.h:2646
#6  std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string (this=0x7fffac007210, __in_chrg=<optimised out>)
    at /usr/include/c++/5/bits/basic_string.h:2943
#7  std::experimental::filesystem::v1::path::~path (
    this=this@entry=0x7fffac007210, __in_chrg=<optimised out>)
    at /usr/include/c++/5/experimental/fs_path.h:190
#8  0x0000000000735224 in std::experimental::filesystem::v1::path::_Cmpt::~_Cmpt (this=0x7fffac007210, __in_chrg=<optimised out>)
    at /usr/include/c++/5/experimental/fs_path.h:568
#9  std::_Destroy<std::experimental::filesystem::v1::path::_Cmpt> (
    __pointer=<optimised out>) at /usr/include/c++/5/bits/stl_construct.h:93
#10 std::_Destroy_aux<false>::__destroy<std::experimental::filesystem::v1::path::_Cmpt*> (__last=<optimised out>, __first=0x7fffac007210)
    at /usr/include/c++/5/bits/stl_construct.h:103
#11 std::_Destroy<std::experimental::filesystem::v1::path::_Cmpt*> (
    __last=<optimised out>, __first=<optimised out>)
    at /usr/include/c++/5/bits/stl_construct.h:126
#12 std::_Destroy<std::experimental::filesystem::v1::path::_Cmpt*, std::experimental::filesystem::v1::path::_Cmpt> (__last=0x7fffac007240,
    __first=<optimised out>) at /usr/include/c++/5/bits/stl_construct.h:151
#13 std::vector<std::experimental::filesystem::v1::path::_Cmpt, std::allocator<std::experimental::filesystem::v1::path::_Cmpt> >::~vector (
    this=0x7fffac00f1f8, __in_chrg=<optimised out>)
    at /usr/include/c++/5/bits/stl_vector.h:424
#14 std::experimental::filesystem::v1::path::~path (this=0x7fffac00f1f0,
    __in_chrg=<optimised out>) at /usr/include/c++/5/experimental/fs_path.h:190
#15 stdext::path_impl::~path_impl (this=0x7fffac00f1f0,
    __in_chrg=<optimised out>)
    at /mnt/hgfs/thedarkmod/darkmod_src/idlib/StdFilesystem.cpp:41
#16 std::default_delete<stdext::path_impl>::operator() (this=<optimised out>,
    __ptr=0x7fffac00f1f0) at /usr/include/c++/5/bits/unique_ptr.h:76
#17 std::unique_ptr<stdext::path_impl, std::default_delete<stdext::path_impl> >::~unique_ptr (this=<optimised out>, __in_chrg=<optimised out>)
    at /usr/include/c++/5/bits/unique_ptr.h:236
#18 stdext::path::~path (this=<optimised out>, __in_chrg=<optimised out>)
    at /mnt/hgfs/thedarkmod/darkmod_src/idlib/StdFilesystem.cpp:54
#19 0x00000000008e2b6a in CDownload::Perform (this=0x7fffb405f1f0)
    at /mnt/hgfs/thedarkmod/darkmod_src/game/Missions/Download.cpp:175
#20 0x000000000117e790 in execute_native_thread_routine ()
#21 0x00007ffff7bbbe64 in start_thread (arg=<optimised out>)
    at pthread_create.c:486
#22 0x00007ffff6e1a12f in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Then I quit & killed the process

Full terminal log is here

Core dump (zipped) is here


I see the same trash again.

It turns out that in the damn Linux world, you cannot open a core dump made on a machine with a different glibc version if the program is multithreaded! At least that's how I understand the warning "Unable to find libthread_db matching inferior's thread library, thread debugging will not be available." I don't know how to fix that... to me it makes core dumps a rather useless feature.

I'm glad you posted a stack trace. At least I know where to look.


I am so sorry for causing all this grief and I am amazed that debugging on a different Linux system is so difficult, it's not the Linux system you're debugging after all

Edited by esme
too much twitter

  

3 hours ago, esme said:

I am so sorry for causing all this grief

I think you suffered a lot more than me to get to this point 🤔
I'd be disappointed if this came to nothing after so much effort.

Quote

and I am amazed that debugging on a different Linux system is so difficult, it's not the Linux system you're debugging after all

Yes, I'm surprised too.

Do you know by any chance if your glibc was compiled with -fomit-frame-pointer?
One possible explanation is that your glibc build has no frame pointers (maybe all builds are done like this). I don't have debug symbols for it, so my gdb cannot unwind through glibc functions when building the stack trace (note that the crash happens inside free). You do have debug information for glibc (I suppose it was installed with one of the packages, e.g. with gdb), which is enough for the stack trace to get through the glibc functions and show the TDM call stack.

Anyway, I don't think it makes sense to fight with this mess.
It's time to either forget about this problem until better circumstances, or try debugging over Skype or something like that (which would probably be lengthy and end nowhere too).


19 hours ago, stgatilov said:

Do you know by any chance if your glibc was compiled with -fomit-frame-pointer?

I'm sorry I have no idea, I wouldn't even know where to look, I'm guessing there's a make file somewhere but I don't know where

If you want to try something like inserting snapshots into the download code & dumping information to the console I'll happily run that & send you the results, but unless you have a pretty good idea where the problem is that's a bit hit & miss

I could have a look at the download code if you like, fresh eyes might see something

But I agree you've spent more than enough time on this, I can live with it as it only affects downloads & even then it's intermittent

Sorry this has taken so much of your time


21 hours ago, duzenko said:

@esme What about compiling TDM from source in debug mode on your PC?

I already gave debug symbols for the 2.09a release.
They can be used for debugging, although doing it is not very obvious (optimizations make things messy).

The source code for the latest release can be obtained from the website: https://www.thedarkmod.com/sources/thedarkmod.2.09a.src.7z
There is a COMPILING.txt file inside, which explains how to build. It is a pretty standard CMake build, so I guess you can edit CMakeLists.txt to disable optimization.

P.S. Since I'll probably need to analyze other core dumps in the future anyway, I'd be happy to try one more thing. @esme, could you please share your libc.so?


Now I can finally decipher the correct stack trace.

Here is the list of commands that were necessary to achieve that:

# note: all the .so files obtained from user machine must be put into local directory.
#
# most importantly, the following files are necessary:
#   1. libthread_db.so.1 and libpthread.so.0: required for thread debugging.
#   2. other .so files are required if they occur in call stack.
#
# these files must also be renamed to match the symlink names exactly,
# e.g. libpthread-2.28.so should be renamed to libpthread.so.0

# load executable file
file ./thedarkmod.x64

# force gdb to forget about local system!
# load all .so files using local directory as root
set sysroot .

# drop dump-recorded paths to .so files
# i.e. load ./libpthread.so.0 instead of ./lib/x86_64-linux-gnu/libpthread.so.0
set solib-search-path .
# disable damn security protection
set auto-load safe-path /

# load core dump file
core core.6487

# print stacktrace
bt

🤬

