View topic - Core dump issue

Core dump issue

General Help about QNX

Core dump issue

Postby rnyter » Sun Aug 30, 2009 9:43 pm

Hi, everyone.

I tried to write data into a text file at 100Hz per line. Sometimes there would be a Memory Fault and created a core dump. The core dump file implied that there was a error when calling the function _Lockfilelock().

Here's the code:

#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <termios.h>


int main( void )
{
int fd_ser;
FILE *fp_file;
char rbuffer[300]={""}; //store GPS data
char rbuffer_crc[12]={""}; //GPS CRC
char cbuffer[400]={""}; //store configuration data
char wbuffer[40]={"log inspvasa ontime 0.1\r"}; //command
struct termios termios_p; //the serrial port attribute

//////////////////////////////////////////////
//Open the data recording file
//////////////////////////////////////////////
if((fp_file = fopen("GPSdata.txt", "a+"))==NULL)
{
printf("open error.\n");
return EXIT_FAILURE;
}

///////////////////////////////////////
//Open the serial port
//////////////////////////////////////
fd_ser = open( "/dev/ser1", O_RDWR);

///////////////////////////////////////////////////
//Set the forwarding character ''
///////////////////////////////////////////////////
tcgetattr(fd_ser, &termios_p);
termios_p.c_cc[VFWD] = 0x00;
tcsetattr(fd_ser, TCSANOW, &termios_p);

///////////////////////////////////////
//Check satellite Omnistar
///////////////////////////////////////
printf("Checking Omnistar...\t\t");
fflush(stdout);
do
{
write(fd_ser, "log bestgpsposa once\r", sizeof("log bestgpaposa once\r"));
readcond(fd_ser, cbuffer, sizeof(cbuffer), 300, 0, 5); //Read the data
printf("%s\n",cbuffer);
}while(!strstr(cbuffer, "OMNISTAR"));
printf("******Connected!!******\n");
memset(cbuffer, 0, 400); //clear buffer

///////////////////////////////////////////////////
//Set the forwarding character '*'
///////////////////////////////////////////////////
tcgetattr(fd_ser, &termios_p);
termios_p.c_cc[VFWD] = '*';
tcsetattr(fd_ser, TCSANOW, &termios_p);

sleep(3);

///////////////////////////
//Send command
//////////////////////////
write(fd_ser, wbuffer, sizeof(wbuffer));

////////////////////////////////
// Read OK
////////////////////////////////
do
{
readcond(fd_ser, cbuffer, 13, 13, 0, 5); //Read 'OK'
}while(!strstr(cbuffer, "OK"));
memset(cbuffer, 0, 400); //clear buffer

//////////////////////
//Read GPS data
//////////////////////

while(1)
{

readcond(fd_ser, rbuffer, sizeof(rbuffer), 0, 0, 10);
readcond(fd_ser, rbuffer_crc, 10, 10, 0, 10);
strcat(rbuffer, rbuffer_crc);

////////////////////////////////////////////////
//Write directly into GPSdata.txt
////////////////////////////////////////////////
fprintf(fp_file, "%s", rbuffer);

printf("******Data from GPS******\n%s\n", rbuffer);
memset(rbuffer, 0, 300); //clear buffer
memset(rbuffer_crc, 0, 12);
}

//////////////////////////////////////////////////////
//Clear the forwarding character '*'
//////////////////////////////////////////////////////
tcgetattr(fd_ser, &termios_p);
termios_p.c_cc[VFWD] = 0x00;
tcsetattr(fd_ser, TCSANOW, &termios_p);

////////////////////////
// Close the file
////////////////////////
close( fd_ser );
fclose(fp_file);

printf("GPS process done!\n");

return EXIT_SUCCESS;
}
Last edited by rnyter on Mon Aug 31, 2009 2:56 pm, edited 1 time in total.
rnyter
Active Member
 
Posts: 24
Joined: Fri Jun 26, 2009 9:27 pm

RE: Core dump issue

Postby mario » Mon Aug 31, 2009 3:01 am

were is the fclose().

What is in buffer. Why use "%s" instead of usin buffer.

When posting code post complete code.
mario
QNX Master
 
Posts: 4132
Joined: Sun Sep 01, 2002 1:04 am

Re: RE: Core dump issue

Postby rnyter » Mon Aug 31, 2009 3:04 pm

mario wrote:were is the fclose().

What is in buffer. Why use "%s" instead of usin buffer.

When posting code post complete code.



Sorry about not describe the problem clearly. I have posted the complete code.

The data in the buffer is something like:
%INSPVASA,1546,506648.111;1546,506648.111248050,41.836873263,-87.627346927,156.220162736,-0.049152740,0.008858111,-0.394364234,0.796106837,-1.517555203,131.984234026,INS_ALIGNMENT_COMPLETE*743ffb22

I use "%s" because I need to write the data formatly into the text file later.
rnyter
Active Member
 
Posts: 24
Joined: Fri Jun 26, 2009 9:27 pm

Postby Tim » Mon Aug 31, 2009 4:44 pm

Rnyter,

You might consider giving fwrite a try and see if that fixes the problem.

Tim
Tim
Senior Member
 
Posts: 1511
Joined: Wed Mar 10, 2004 12:28 am

Postby mario » Mon Aug 31, 2009 7:39 pm

I could point out lines of code but let's generalize and develop your debugging skills. You have made the assumption, in many places, that when a message is received from the serial port that it's NULL terminated, but it's not.

For example you cannot do a strcat after a readconf because the received data doesn't end with a 0 ( null terminator ). It wouldn't be as bad if you fill the receive buffer with 0 before calling readconf (which the code sometimes does). Personnaly I wouldn't do that that because it hurt performance. Instead I would rely on the size of the data received as returned by readconf and put a 0 at the end of the data block, making it a real and safe string.

You said "I use "%s" because I need to write the data formatly into the text file later". I don't understand that this has do to with it. fprintf( fd, "%s", buffer ) generates the exact same results as fprintf( fd, buffer ) however I could be wrong and the %s may be faster. Not sure anymore ;-) That being said printf("%s\n", buffer) is much slower then puts(buffer)

Also *never* do a memset() with constant for size. Always use a sizeof.

There is a while(1) which no never get out of, yet there is clean up code afterward??
mario
QNX Master
 
Posts: 4132
Joined: Sun Sep 01, 2002 1:04 am

Postby rnyter » Mon Aug 31, 2009 8:54 pm

Tim wrote:Rnyter,

You might consider giving fwrite a try and see if that fixes the problem.

Tim


Thanks Tim, I will try that.
Last edited by rnyter on Tue Sep 01, 2009 4:47 pm, edited 2 times in total.
rnyter
Active Member
 
Posts: 24
Joined: Fri Jun 26, 2009 9:27 pm

Postby rnyter » Mon Aug 31, 2009 9:39 pm

mario wrote:I could point out lines of code but let's generalize and develop your debugging skills. You have made the assumption, in many places, that when a message is received from the serial port that it's NULL terminated, but it's not.

For example you cannot do a strcat after a readconf because the received data doesn't end with a 0 ( null terminator ). It wouldn't be as bad if you fill the receive buffer with 0 before calling readconf (which the code sometimes does). Personnaly I wouldn't do that that because it hurt performance. Instead I would rely on the size of the data received as returned by readconf and put a 0 at the end of the data block, making it a real and safe string.

You said "I use "%s" because I need to write the data formatly into the text file later". I don't understand that this has do to with it. fprintf( fd, "%s", buffer ) generates the exact same results as fprintf( fd, buffer ) however I could be wrong and the %s may be faster. Not sure anymore ;-) That being said printf("%s\n", buffer) is much slower then puts(buffer)

Also *never* do a memset() with constant for size. Always use a sizeof.

There is a while(1) which no never get out of, yet there is clean up code afterward??


Thanks for your detailed and patient reply, Mario.

My apology for my bad style of coding. I will try to improve me skills. I know that there is a terminator, something like \r\n, at the end of the received data. However during the configuration stage of the GPS, we need to send a lot of logs over the serial port. And each time we will get some responses like "OK" or "OK [COM1]" or else and then followed by the data string. The problem is that there are also the same terminator before and after these response. The only difference between those response and the data string from GPS is there is a '*' in the data string, right before the CRC. That's why I tried to use the '*' as a mark and read the remaining CRC and do a strcat.

You are right that it is a bad idea to do these so in another version, when it comes to the data collecting stage (without sending logs), I unraw the serial port, in another version, so it will read the data until it meets a terminator. But I think that would not make any differences because the memory fault occurs when calling the _Lockfilelock().

I don't know whether it would be faster to use fprintf(fd, "%s", buffer) than fprintf(fd, buffer) or not. But I know I will need to use fprintf(fd, "%s, %s, ...", gps_speed, gps_velocity, ...) later. So if there is a problem when using "%s" I think I still need to figure it out what the problem is.

About the memset(), it works. But you're right, I should be more careful about this.

For the while(1). In the new version, it creates a thread to break the loop and do the remaining clean up things. Since I want to make the description clearer, I posted this old version code. The same problem happens sometimes when running with the new version.

Thanks again!
rnyter
Active Member
 
Posts: 24
Joined: Fri Jun 26, 2009 9:27 pm

Postby mario » Mon Aug 31, 2009 10:02 pm

rnyter wrote:
Thanks for your detailed and patient reply, Mario.

My apology for my bad style of coding. I will try to improve me skills. I know that there is a terminator, something like \r\n, at the end of the received data.


This terminator means nothing to the C language. Only a 0 byte can be use as a string terminator.
rnyter wrote:
However during the configuration stage of the GPS, we need to send a lot of logs over the serial port. And each time we will get some responses like "OK" or "OK [COM1]" or else and then followed by the data string. The problem is that there are also the same terminator before and after these response. The only difference between those response and the data string from GPS is there is a '*' in the data string, right before the CRC. That's why I tried to use the '*' as a mark and read the remaining CRC and do a strcat.


Just as I mention above, you can't use a strcat because the data received does not contain a 0 at the end of the block.

rnyter wrote:You are right that it is a bad idea to do these so in another version, when it comes to the data collecting stage (without sending logs), I unraw the serial port, in another version, so it will read the data until it meets a terminator. But I think that would not make any differences because the memory fault occurs when calling the _Lockfilelock().



The fprintf function is being use all over the world. It is very much extremely unlikely to be buggy.... The way you handle "strings" is most definitely causing memory corruption which ends up crashing fprintf.

To be more clear :

char buffer[200];

This statement allocates space on the stack, since it's local, the standard say the content is undefined.

then readconf( buffer.....);

Say it read 4 bytes "OK\n\r". If we'd look in the memory area where buffer lives we could see something like:

O K \n \r & @ D L F ! 2 ..... and so forth.

Notice the 4 first bytes are the received data, and the rest of the array is full of garbage, and what do you know none of them are 0. That means the string is very long, in fact we don't even know how long it is. The 0 that terminates the string could end up being 1 Meg pass buffer, creating a 1 Meg string...

char buffer2[200];
buffer2[0] = 0; // create an empty string

strcpy ( buffer2, buffer );

Now what do you thing is going to happen? If you are lucking you get a SIGSEGV if not you get fprintf that crashes ;-)
mario
QNX Master
 
Posts: 4132
Joined: Sun Sep 01, 2002 1:04 am

Postby Tim » Tue Sep 01, 2009 12:54 pm

Mario,

In his main loop he definitely has terminators thanks to the memsets so he can do the strcat call.

Code: Select all
while(1)
{
readcond(fd_ser, rbuffer, sizeof(rbuffer), 0, 0, 10);
readcond(fd_ser, rbuffer_crc, 10, 10, 0, 10);
strcat(rbuffer, rbuffer_crc);

////////////////////////////////////////////////
//Write directly into GPSdata.txt
////////////////////////////////////////////////
fprintf(fp_file, "%s", rbuffer);

printf("******Data from GPS******\n%s\n", rbuffer);
memset(rbuffer, 0, 300); //clear buffer
memset(rbuffer_crc, 0, 12);
}


So unless the fprintf is crashing on the 1st time through the main loop (something he can check by incrementing a variable or he can just move the memsets to just before the readcond calls) it's something else entirely that is causing the problem.

That aside, all your comments about his code is excellent advice.

Tim
Tim
Senior Member
 
Posts: 1511
Joined: Wed Mar 10, 2004 12:28 am

Postby mario » Tue Sep 01, 2009 3:08 pm

[quote="Tim"]Mario,

In his main loop he definitely has terminators thanks to the memsets so he can do the strcat call.

[code]

Yes but it's AFTER the first readconf. On the first call to readconf it's already too late.
mario
QNX Master
 
Posts: 4132
Joined: Sun Sep 01, 2002 1:04 am

Postby rnyter » Tue Sep 01, 2009 4:37 pm

mario wrote:
Tim wrote:Mario,

In his main loop he definitely has terminators thanks to the memsets so he can do the strcat call.

[code]

Yes but it's AFTER the first readconf. On the first call to readconf it's already too late.


Maybe I am wrong but when I defined the rbuffer[300] and rbuffer_crc[12], I initialized them as empty arrays. I think there are also 0s at the end of these buffers, actually there are 300 0s and 12 0s separately.
rnyter
Active Member
 
Posts: 24
Joined: Fri Jun 26, 2009 9:27 pm

Postby mario » Tue Sep 01, 2009 4:59 pm

rnyter wrote:Maybe I am wrong but when I defined the rbuffer[300] and rbuffer_crc[12], I initialized them as empty arrays. I think there are also 0s at the end of these buffers, actually there are 300 0s and 12 0s separately.


You're right, I missed it. Hum then change the fprintf to something like fprintf(rbuffer, "%s", "a know string" ); If it doesn't crash then there is something wrong with rbuffer. If it still crashes, replace readcond() with a strcpy of what you expect to received. I'm still convince something gets corrupted, don't see what though.

You should check the return value of the open of the serial port. Also check the return value of fprintf ( what if the disk gets full).

Before the fprintf, print the value of fd_file to make sure it doesn't get corrupted.

write(fd_ser, wbuffer, sizeof(wbuffer)); should be
write(fd_ser,wbuffer, strlen ( wbuffer ) ); otherwise it's sending extra 0.
Or don't put a size in [] and do a sizeof(wbuffer)-1. The -1 is to not send the extra 0.
mario
QNX Master
 
Posts: 4132
Joined: Sun Sep 01, 2002 1:04 am

Postby rnyter » Tue Sep 01, 2009 5:31 pm

mario wrote:
You're right, I missed it. Hum then change the fprintf to something like fprintf(rbuffer, "%s", "a know string" ); If it doesn't crash then there is something wrong with rbuffer. If it still crashes, replace readcond() with a strcpy of what you expect to received. I'm still convince something gets corrupted, don't see what though.

You should check the return value of the open of the serial port. Also check the return value of fprintf ( what if the disk gets full).

Before the fprintf, print the value of fd_file to make sure it doesn't get corrupted.

write(fd_ser, wbuffer, sizeof(wbuffer)); should be
write(fd_ser,wbuffer, strlen ( wbuffer ) ); otherwise it's sending extra 0.
Or don't put a size in [] and do a sizeof(wbuffer)-1. The -1 is to not send the extra 0.


Thanks, I will try that. But the problem is I don't know when the error will occur. I mean the code I posted usually worked well. It was collecting data and writing into the text file for ten minutes even half an hour. But sometimes, which I cannot predict, a Memory fault occured. The only thing I know is the error occured when fprintf called the _Lockfilelock(). Another thing I know that is, since I print all the data on the screen rows by rows, when core dumped, the rows of the data in the text file are less than the rows on the screen. Consider the _lockfilelock(), maybe sometimes, somehow the file is locked but the program keep trying to write data into it, after maybe ten times trying, the program failed, and core dumped?
rnyter
Active Member
 
Posts: 24
Joined: Fri Jun 26, 2009 9:27 pm

Postby mario » Tue Sep 01, 2009 5:49 pm

Even if locking failed for some reason I wouldn't expect it to crash, fprintf would return -1 or something like that.

The problem may occur rarely because it depends no what the GPS returns.

Yes the textfile is using fprintf and as such is buffered try putting a fflush() afterthe fprintf or setit to non buffered with setvbuf. I think unlike the console a /n doesn't cause data to be flushed to a file.

If you are using 6.4.1 you could try using mudflap ( check the doc ).
mario
QNX Master
 
Posts: 4132
Joined: Sun Sep 01, 2002 1:04 am

Postby Tim » Tue Sep 01, 2009 6:36 pm

rnyter,

Did the fwrite help?

Also when the crash occurs and you look at the core dump when you walk back up the stack what does rbuffer and rbuffer_crc contain? Knowing that would go a long way toward determining what exactly is happening in terms of corruption (which I agree with Mario must be happening but isn't entirely clear to me either where it does).

Late note here:

I just looked at readcond in the help doc's.

The case where the last 3 arguments are (0,0,t) is marked as RESERVED. Thus I assume this means an indeterminate behavior. Your main readcond uses that exact argument format:

readcond(fd_ser, rbuffer, sizeof(rbuffer), 0, 0, 10);

Maybe it's as simple as changing that is all you need.

Tim
Tim
Senior Member
 
Posts: 1511
Joined: Wed Mar 10, 2004 12:28 am

Next

Return to Help

Who is online

Users browsing this forum: Google Adsense [Bot] and 4 guests