Application hang without debug information
Dear all,
I have an application running on QNX 6.5.0 on an ARM A8 processor. I launch it in debug mode from the IDE and it starts running as expected. After roughly 20 minutes, or sometimes an hour, the application simply hangs, and the IDE reports no crash such as a segmentation fault, so I cannot debug it at that point. I check the system memory with "pidin info" from within my application, and the memory usage looks correct. When the application hangs, the wireless connection also goes down and I cannot even telnet into the system. Are there any methods I can use to debug the application in this situation?
My current guess is that, since the IDE cannot catch a crashing line in the code, the hang may not be caused by the code itself. If the application had hung because it deadlocked itself, telnet should still work. I have also disabled all the CPU-intensive functions in the application. Is there a way to record CPU usage, memory usage and events, either in the system services or in the application itself, to identify the problem? Thanks very much.
Best regards,
Eric
- Ericxx
- Senior Member
- Posts: 158
- Joined: Mon Jun 09, 2008 1:38 pm
Re: Application hang without debug information
Assuming you have some writable disk space, there is the kernel logger.
- maschoen
- QNX Master
- Posts: 2728
- Joined: Wed Jun 25, 2003 5:18 pm
Re: Application hang without debug information
It's possible that a high priority task (driver?) is running 'ready' and consuming 100% of the CPU. That would lock out the IDE and your application.
To test this theory you could try setting the priority of your telnet session / qconn higher than that of anything else running on your system, including drivers.
Tim
- Tim
- Senior Member
- Posts: 1514
- Joined: Wed Mar 10, 2004 12:28 am
Re: Application hang without debug information
Thanks maschoen and Tim.
Yes, I found that my application has a thread with priority 10, which is the same as the kernel's as shown by the "top" command. I lowered the priority of my application's highest-priority thread and can now telnet in when the application hangs.
After many online debugging sessions, we found the place where the application hangs. However, when we checked the variable values in the hung thread, both the local variables and a global variable were corrupted with unreasonable values, and that caused the hang. We then added a hardware watchpoint on the global variable to see exactly where it was changed during execution. However, the watchpoint was not hit when the thread hung.
We also tried the memory analysis toolkit with the "Memory problems" options and still could not find the corrupted area in the code. Could you please suggest a way to debug this kind of issue? Many thanks.
Best regards,
Eric
- Ericxx
- Senior Member
- Posts: 158
- Joined: Mon Jun 09, 2008 1:38 pm
Re: Application hang without debug information
Eric,
I don't think that the hw watchpoints work in the way you are hoping for. At least in my experience with them I've never managed to use them to find the kind of problem you are describing. That's because the corruption is occurring elsewhere (a pointer running wild) and not from code statements modifying the variable. The 2nd paragraph of this gdb reference (gdb is what the QNX debugger is based on) talks about this. But maybe someone else has used watchpoints successfully in a multi-threaded environment and can enlighten us both.
https://sourceware.org/gdb/onlinedocs/g ... oints.html
Things I can suggest:
1) Turn the warning level to maximum, re-compile everything and make sure you fix every warning (uninitialized variables are a big culprit). You might want to start with '-Wall -Wuninitialized -O'. You can find the gcc warning types, for hunting array bounds problems and other things, here:
https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
2) If you are running in debug mode, switch to release mode. This normally causes a much faster 'crash' that will help locate where the real problem is (run the dumper process so you get a dump file, which will give you the crash address that you can then look up in a map file - turn on the map file option when compiling). On the other hand, if you are running in release mode, switch to debug mode.
3) Visually inspect your code base. You're looking for pointers or any C arrays/buffers (this is where C++ really helps over C if you are using C++ stdlib classes). One of those is definitely the problem (this can be especially tricky in multi-threaded code if pointers are shared between threads and the memory space or what's pointed to can change and you missed adding a mutex someplace...).
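As an illustration of item 1, here is a minimal sketch of the kind of bug those flags are meant to catch. The function name sum_samples is hypothetical and not taken from the application discussed in this thread. The comment marks the line where dropping the initializer would leave the result dependent on whatever happened to be on the stack, which gcc with '-Wall -Wuninitialized -O' will typically report as a variable used (or possibly used) uninitialized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sums the first n entries of samples. */
static int sum_samples(const int *samples, size_t n)
{
    /* Writing just "int total;" here is the bug: the loop would then
     * accumulate onto an indeterminate value. */
    int total = 0;

    for (size_t i = 0; i < n; ++i)
        total += samples[i];

    return total;
}
```

Bugs of this kind often behave differently between debug and release builds, which fits the advice in item 2 to try both.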
Tim
- Tim
- Senior Member
- Posts: 1514
- Joined: Wed Mar 10, 2004 12:28 am
Re: Application hang without debug information
Dear Tim,
Many thanks for your detailed explanations. As you suggested, I fixed the uninitialised and unused variables reported in the compilation warnings. I then ran the program in release mode for about 2 hours and the hang did not occur. Previously, we ran the tests in debug mode rather than release mode. Can I say that the memory layout somehow changed after fixing the warnings and launching in release mode? I am not sure whether fixing the warnings has actually solved the memory corruption problem.
Now, to locate the source of the problem, we have put printf calls in each thread that report if the global variable's value gets corrupted during execution, and we are still looking for the culprit. Hopefully we can find the exact problem.
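A check of this kind could be sketched roughly as follows. This is only an illustration under assumptions: the struct, the guard pattern and the names g_state, guards_intact and CHECK_STATE are all hypothetical, and the real global is represented by a plain int. The idea is to place known guard words on both sides of the variable and have each thread test them at convenient points, logging the file and line of the first check that fails.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define GUARD_PATTERN 0xDEADBEEFu

/* Hypothetical layout: the watched value sits between two guard words. */
struct guarded_state {
    uint32_t guard_before;
    int      value;        /* stands in for the global that gets corrupted */
    uint32_t guard_after;
};

static struct guarded_state g_state = { GUARD_PATTERN, 0, GUARD_PATTERN };

/* Returns 1 if both guard words are intact, 0 otherwise.
 * On failure it logs the call site so the first failing check is visible. */
static int guards_intact(const struct guarded_state *s,
                         const char *file, int line)
{
    if (s->guard_before == GUARD_PATTERN && s->guard_after == GUARD_PATTERN)
        return 1;

    fprintf(stderr, "%s:%d: guard overwritten (before=0x%08x after=0x%08x)\n",
            file, line,
            (unsigned)s->guard_before, (unsigned)s->guard_after);
    return 0;
}

/* Convenience macro so each call records where it was made. */
#define CHECK_STATE() guards_intact(&g_state, __FILE__, __LINE__)
```

Calling CHECK_STATE() before and after suspect operations in each thread narrows down the window in which the overwrite happens. It only detects writes that touch a guard word, so an overrun from a neighbouring buffer is likely to be caught, while a stray pointer that hits only the value itself is not.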
Best regards,
Eric
- Ericxx
- Senior Member
- Posts: 158
- Joined: Mon Jun 09, 2008 1:38 pm
Re: Application hang without debug information
Eric,
The memory layout definitely changes when you run in release mode. This mostly affects local (stack) variables as opposed to global variables. What happens is that the optimization in release mode turns many stack variables into register variables so they don't occupy stack memory (this is why you can't debug a release compilation because the variables don't occupy any traditional memory space).
If you ran for 2 hrs with no problems in release mode after fixing the warnings there is a good chance you fixed the problem. You can determine if that's the case by:
1) Going back to the original code before you fixed the warnings (you are using some kind of source code control, right?) and compile it in release mode and see if the hangup occurs. If it does then you know you fixed the problem.
2) Run your current code with all the fixes in debug mode and see if the hangup occurs.
Your last paragraph indicates you are still looking for the culprit. Are you looking to determine exactly what you might have fixed because I thought it was now working (2+ hrs no hang)?
Tim
- Tim
- Senior Member
- Posts: 1514
- Joined: Wed Mar 10, 2004 12:28 am
Re: Application hang without debug information
Hello Tim,
We have identified the problem, and it was indeed a mistake in a memcpy on an array. It just took quite a while before we caught this bug. After fixing it, the application now runs smoothly. Btw, is there an effective memory analysis tool for QNX, something like Valgrind? Thanks a lot for your kind help.
Best regards,
Eric
- Ericxx
- Senior Member
- Posts: 158
- Joined: Mon Jun 09, 2008 1:38 pm
Re: Application hang without debug information
Eric,
The IDE includes a memory analysis tool (note you need to link in a special library to override the normal one)
http://www.qnx.com/developers/docs/6.5. ... dures.html
which can detect buffer overflows like the one you experienced
http://www.qnx.com/developers/docs/6.5. ... flow_.html
However it only works on heap variables and not stack variables. So if you have a function like
void foo()
{
    char bar[5];
    memset(&bar, 0, 6); // overflow: 6 bytes written into a 5-byte array
}
it won't help you.
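Since the tool does not cover stack buffers, one defensive option is to route such copies through a helper that is told the destination size. The following is a minimal sketch only; bounded_copy is a hypothetical helper, not a QNX or libc function, and it assumes the caller passes the true size of the destination (for example with sizeof on the array itself).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies n bytes from src to dst only if they fit
 * into a destination of dst_size bytes.
 * Returns 0 on success, -1 if the copy would overflow (nothing is written). */
static int bounded_copy(void *dst, size_t dst_size, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    if (n > dst_size)
        return -1;

    memcpy(dst, src, n);
    return 0;
}
```

With char bar[5], a call such as bounded_copy(bar, sizeof bar, src, 6) returns -1 instead of writing past the end of the array. The check costs one comparison per copy and only helps where the destination size is actually known at the call site.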
Tim
- Tim
- Senior Member
- Posts: 1514
- Joined: Wed Mar 10, 2004 12:28 am
Re: Application hang without debug information
I usually use Mudflap to catch this kind of problem.
- nico04
- Senior Member
- Posts: 171
- Joined: Wed Sep 29, 2010 9:59 am
- Location: France
Re: Application hang without debug information
Thanks, I just started using the mudflap option now.
I just came across a strange problem: my original application runs normally without the mudflap option. After compiling with the mudflap option and launching the run configuration with mudflap, the application stops with the message "Process 163857 (gumstix_app) terminated SIGSEGV code=1 fltno=11 ip=0105a138(libc.so.3@_Initlocks+0x80) mapaddr=0005a138. ref=00000004" and cannot proceed with the mudflap analysis.
What might be the reasons for this? Thanks.
Eric
- Ericxx
- Senior Member
- Posts: 158
- Joined: Mon Jun 09, 2008 1:38 pm
Re: Application hang without debug information
Attached is a debugger screenshot taken when the SIGSEGV happened.
Attachment: mudflap_SEGSEGV_screenshot.png (18.1 KiB)
- Ericxx
- Senior Member
- Posts: 158
- Joined: Mon Jun 09, 2008 1:38 pm
Re: Application hang without debug information
http://www.qnx.com/developers/docs/6.5. ... nIDE_.html
Are you doing regular memory analysis too? The docs say it will cause the code to crash if you are using it in conjunction with Mudflap.
KGB
- Tim
- Senior Member
- Posts: 1514
- Joined: Wed Mar 10, 2004 12:28 am
Re: Application hang without debug information
Thanks, Tim.
I am not using the memory analysis tool together with mudflap. I guess the cause is the missing linker option (-fmudflapth -lmudflapth). I will try it and see.
Regards,
Eric
- Ericxx
- Senior Member
- Posts: 158
- Joined: Mon Jun 09, 2008 1:38 pm
Re: Application hang without debug information
As my application is multi-threaded, I tried the link option -lmudflapth and still got the same result. I am wondering whether -fmudflapth should be a compiler option rather than a link option? My application is currently compiled with "-lmudflapth -fmudflapth", but I still cannot get rid of the crash when launching the mudflap test. Please kindly suggest.
Regards,
Eric
- Ericxx
- Senior Member
- Posts: 158
- Joined: Mon Jun 09, 2008 1:38 pm