Core dump issue

General Help about QNX

Postby rnyter » Tue Sep 01, 2009 7:33 pm

mario wrote: Even if locking failed for some reason, I wouldn't expect it to crash; fprintf would return -1 or something like that.

The problem may occur rarely because it depends on what the GPS returns.

Yes, the text file is written with fprintf and as such is buffered. Try putting an fflush() after the fprintf, or set the stream to unbuffered with setvbuf. I think that, unlike the console, a \n doesn't cause data to be flushed to a file.

If you are using 6.4.1 you could try using mudflap (check the docs).
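A minimal sketch of the two options mentioned above, using only standard C stdio. The helper names log_line and make_line_buffered are made up for illustration and do not come from the original program. Note that fflush() only hands the data to the operating system; it does not force it onto the disk.

```c
#include <assert.h>
#include <stdio.h>

/* Write one record and push it out of the stdio buffer immediately.
 * Returns 0 on success, -1 if the write or the flush failed. */
int log_line(FILE *fp, const char *line)
{
    assert(fp != NULL && line != NULL);
    if (fprintf(fp, "%s\n", line) < 0)
        return -1;
    return fflush(fp) == 0 ? 0 : -1;
}

/* Alternative: switch the stream to line buffering once, right after
 * fopen() and before any other I/O, so that each '\n' triggers a flush.
 * Passing _IONBF instead of _IOLBF would disable buffering entirely. */
int make_line_buffered(FILE *fp)
{
    assert(fp != NULL);
    return setvbuf(fp, NULL, _IOLBF, BUFSIZ) == 0 ? 0 : -1;
}
```

Either option should be enough on its own; flushing after every row costs some throughput, so it is mostly useful while hunting a crash.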


OK, thanks. I will try it to see whether it can find the corruption.
rnyter
Active Member
 
Posts: 24
Joined: Fri Jun 26, 2009 9:27 pm

Postby rnyter » Tue Sep 01, 2009 7:51 pm

Tim wrote:rnyter,

Did the fwrite help?

Also, when the crash occurs and you look at the core dump, what do rbuffer and rbuffer_crc contain when you walk back up the stack? Knowing that would go a long way toward determining exactly what is happening in terms of corruption (which I agree with Mario must be happening, though it isn't entirely clear to me either where it does).

Late note here:

I just looked at readcond in the help docs.

The case where the last 3 arguments are (0,0,t) is marked as RESERVED. I assume this means the behavior is indeterminate. Your main readcond uses that exact argument format:

readcond(fd_ser, rbuffer, sizeof(rbuffer), 0, 0, 10);

Maybe changing that is all you need.

Tim


Hi, Tim.

Like I said, I can't predict when the crash will happen. Sometimes it happens once or twice, but other times it never happens, so we will need some time to test it. I will also try checking the memory in rbuffer and rbuffer_crc.

Thanks for pointing out the RESERVED argument format. I really hadn't noticed that. What I used at the beginning was (sizeof(rbuffer), 0, 10), and I think I'd better change it back.

Rnyter
rnyter
Active Member
 
Posts: 24
Joined: Fri Jun 26, 2009 9:27 pm

Postby Tim » Tue Sep 01, 2009 8:14 pm

rnyter,

One other thing you might want to change:

In your read of data

readcond(fd_ser, rbuffer, sizeof(rbuffer), 0, 0, 10);

You ask for up to the full buffer size (300). If for any reason readcond had this much data, you'd get 300 bytes' worth. Then the strcat that occurs 2 lines later to add the crc would definitely overflow the 300-byte buffer. To be *really* safe, you should either read 287 bytes or have another buffer that is bigger than rbuffer + rbuffer_crc + the string terminator character.
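As a sketch of the bounds check this implies, here is a checked append in standard C. The function name append_checked is hypothetical, and since the original program is not shown in full, this is only an illustration of the idea rather than a patch for it. It also refuses to append when the destination has no terminator inside its buffer, which is what a completely filled read buffer looks like.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append src to the C string in dst only if the result, including the
 * terminating '\0', fits in dst_size bytes.
 * Returns 0 on success, -1 (dst left unchanged) otherwise. */
int append_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    /* Look for the terminator inside the buffer only, never beyond it. */
    const char *end = memchr(dst, '\0', dst_size);
    if (end == NULL)
        return -1;              /* dst is full and not terminated */

    size_t used = (size_t)(end - dst);
    size_t need = strlen(src);
    if (need >= dst_size - used)
        return -1;              /* no room for src plus '\0' */

    memcpy(dst + used, src, need + 1);
    return 0;
}
```

With a helper like this, a call such as append_checked(rbuffer, sizeof(rbuffer), rbuffer_crc) reports an error instead of writing past the end of the buffer, provided the read itself left room for and wrote a terminator.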

Tim
Tim
Senior Member
 
Posts: 1511
Joined: Wed Mar 10, 2004 12:28 am

Postby mario » Tue Sep 01, 2009 8:58 pm

Nice catch Tim!


size = readcond(fd_ser, rbuffer, sizeof(rbuffer) - 1, ...);
... check and handle error
more = readcond(fd_ser, &rbuffer[size], sizeof(rbuffer) - 1 - size, ...);
... check and handle error
totalsize = size + more;
rbuffer[totalsize] = 0; // null-terminate to turn it into a C string

I don't like to rely on a terminator or a timeout. What I usually do when handling a serial port is set it to non-blocking mode and use io_notify to get a pulse when there is something in the rx buffer. When the pulse is received, the code goes and gets whatever is in the rx buffer. Then I have a state machine that handles the data. One could get rid of the io_notify overhead by having a thread do a readcond with a minimum of 1 byte and a maximum of 1024.
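A rough sketch of what such a state machine could look like for rows terminated by \r\n, written in portable C so it can be fed from any read call. All names here (row_parser, row_parser_feed, count_row, ROW_MAX) are invented for illustration, and the 256-byte row limit is an assumption. Rows that are too long or contain a stray '\r' are treated as corrupt and dropped rather than delivered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ROW_MAX 256                 /* assumed longest row, terminator excluded */

enum row_state { ROW_DATA, ROW_GOT_CR };

struct row_parser {
    enum row_state state;
    int bad;                        /* current row is corrupt or too long */
    size_t len;
    char buf[ROW_MAX + 1];
};

typedef void (*row_cb)(const char *row, size_t len, void *ctx);

void row_parser_init(struct row_parser *p)
{
    p->state = ROW_DATA;
    p->bad = 0;
    p->len = 0;
}

/* Feed any chunk of received bytes; cb runs once per complete, valid row. */
void row_parser_feed(struct row_parser *p, const char *data, size_t n,
                     row_cb cb, void *ctx)
{
    assert(p != NULL && (data != NULL || n == 0));
    for (size_t i = 0; i < n; i++) {
        char c = data[i];
        if (p->state == ROW_GOT_CR) {
            p->state = ROW_DATA;
            if (c == '\n') {        /* "\r\n" seen: the row is complete */
                if (!p->bad && cb != NULL) {
                    p->buf[p->len] = '\0';
                    cb(p->buf, p->len, ctx);
                }
                p->len = 0;
                p->bad = 0;
                continue;
            }
            p->bad = 1;             /* '\r' without '\n': corrupt row */
        }
        if (c == '\r')
            p->state = ROW_GOT_CR;
        else if (p->len < ROW_MAX)
            p->buf[p->len++] = c;
        else
            p->bad = 1;             /* too long: drop it at the next "\r\n" */
    }
}

/* Example consumer: counts rows and remembers the most recent one. */
struct row_stats { size_t rows; char last[ROW_MAX + 1]; };

void count_row(const char *row, size_t len, void *ctx)
{
    struct row_stats *st = ctx;
    memcpy(st->last, row, len + 1); /* len <= ROW_MAX, so this always fits */
    st->rows++;
}
```

Because the parser keeps its state between calls, it does not matter whether a read returns half a row or several rows at once, and no strcat on the receive buffer is needed.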
mario
QNX Master
 
Posts: 4132
Joined: Sun Sep 01, 2002 1:04 am

Postby rnyter » Tue Sep 01, 2009 10:00 pm

Thanks to all your suggestions, I think I've solved the problem now.

The program rarely crashes when running on the computer and displaying the received data directly on the monitor.

However, it crashed almost every time when I used phindows on another computer over a twisted wire.

What I see is that phindows displays the data row by row normally in a small console window. But when I maximize the console window, the display slows down quickly and it can no longer show the data row by row. It displays it segment by segment, soon followed by a memory fault and a core dump.

I checked the data on the screen and in the textfile. Both of them have some corrupted data. It is this corrupted data and the strcat() causing the problem like you said.

To verify that, I changed the serial port to unraw mode and used read(fd, rbuffer, sizeof(rbuffer)) so I don't have to use the strcat(). By the way, each row of data coming from the GPS ends with a \r\n, so it can be treated as a whole row.

Under the same conditions, the display is still slow and there is still corrupted data both on the screen and in the textfile when I maximize the console window on phindows. But after running the program for a while I haven't seen the memory fault again.

Of course I have to do more tests to see if it really fixes the problem.

To Tim, I tried the fwrite(), the same problem happened.

Thanks to you guys again!

Rnyter
rnyter
Active Member
 
Posts: 24
Joined: Fri Jun 26, 2009 9:27 pm

Postby rnyter » Tue Sep 01, 2009 10:14 pm

Tim wrote:rnyter,

One other thing you might want to change:

In your read of data

readcond(fd_ser, rbuffer, sizeof(rbuffer), 0, 0, 10);

You ask for up to the full buffer size (300). If for any reason readcond had this much data, you'd get 300 bytes' worth. Then the strcat that occurs 2 lines later to add the crc would definitely overflow the 300-byte buffer. To be *really* safe, you should either read 287 bytes or have another buffer that is bigger than rbuffer + rbuffer_crc + the string terminator character.

Tim


Though I do not use readcond() now, I think your analysis is spot on.

I thought a buffer size of 300 would be enough because I know that the data will never exceed 200 bytes. But now I find that something did sometimes go wrong, though I am not sure whether the buffer overflowed or something else happened.

I should think it through more carefully the next time I use readcond().

Thank you Tim!
rnyter
Active Member
 
Posts: 24
Joined: Fri Jun 26, 2009 9:27 pm

Postby rnyter » Tue Sep 01, 2009 10:27 pm

mario wrote: Nice catch Tim!


size = readcond(fd_ser, rbuffer, sizeof(rbuffer) - 1, ...);
... check and handle error
more = readcond(fd_ser, &rbuffer[size], sizeof(rbuffer) - 1 - size, ...);
... check and handle error
totalsize = size + more;
rbuffer[totalsize] = 0; // null-terminate to turn it into a C string

I don't like to rely on a terminator or a timeout. What I usually do when handling a serial port is set it to non-blocking mode and use io_notify to get a pulse when there is something in the rx buffer. When the pulse is received, the code goes and gets whatever is in the rx buffer. Then I have a state machine that handles the data. One could get rid of the io_notify overhead by having a thread do a readcond with a minimum of 1 byte and a maximum of 1024.


I didn't use io_notify because I have to read the GPS fast (100 Hz) and I don't know whether using io_notify would slow down the program.

One more problem: when I record the data in the textfile, I lose some of it. For example, the GPS is sending data at a 100 Hz rate, which means there should be 100 rows per second. However, I usually get only 93 rows or fewer in the textfile. Maybe the program is still not reading the data fast enough, or maybe it has something to do with the serial port buffer?
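One thing worth ruling out with simple arithmetic is whether the line itself can carry 100 rows per second. The thread gives only the 100 Hz rate and a 200-byte upper bound per row; the 10 bits per byte used below assumes 8N1 framing (start bit, 8 data bits, stop bit), and the actual baud rate and typical row length are not stated, so the numbers are placeholders. The helper names are made up for this sketch.

```c
#include <assert.h>

/* Bits per second needed on the wire to carry rows_per_sec rows of
 * bytes_per_row bytes, with bits_per_byte bits sent per byte
 * (10 for 8N1 framing). */
long required_bps(long rows_per_sec, long bytes_per_row, long bits_per_byte)
{
    assert(rows_per_sec >= 0 && bytes_per_row >= 0 && bits_per_byte > 0);
    return rows_per_sec * bytes_per_row * bits_per_byte;
}

/* Largest number of whole rows per second that a line running at the
 * given baud rate can carry. */
long max_rows_per_sec(long baud, long bytes_per_row, long bits_per_byte)
{
    assert(baud >= 0 && bytes_per_row > 0 && bits_per_byte > 0);
    return baud / (bytes_per_row * bits_per_byte);
}
```

For instance, required_bps(100, 200, 10) gives 200000, so worst-case 200-byte rows at 100 Hz would not fit on a 115200 baud link. If the rows are much shorter than that, the link has headroom and the loss is more likely on the reading side.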
rnyter
Active Member
 
Posts: 24
Joined: Fri Jun 26, 2009 9:27 pm

Postby mario » Wed Sep 02, 2009 3:44 am

The serial driver has some buffering, but if your program is busy, for example doing a printf to a slow device (VGA) or writing to a slow device (hd), the real-time behavior could be affected.

io_notify at, say, 19200 is not a problem at all.

You could increase devc-ser8250's buffer with -I and activate the chip's hardware FIFO with -t.

You have to make your code more resilient.
mario
QNX Master
 
Posts: 4132
Joined: Sun Sep 01, 2002 1:04 am
