Problem with "System Information" in IDE

bridged with qnx.development_tools
Rennie Allen

Problem with "System Information" in IDE

Post by Rennie Allen » Wed Apr 20, 2005 10:54 pm

I am trying to track a memory leak in io-net, and I don't understand some of the data I am getting from the System Information
perspective.

Host and target are both x86 QNX 6.3 SP1.

There is definitely a leak in io-net (I can see it with "pidin pmem" and the "Memory Information" view), but when using the "Malloc
Information" view there is no apparent leak, and the total heap size does not match the total heap size from either the
"Memory Information" view or "pidin pmem". For all other processes that I measure, these two numbers do match.

At this point I don't understand the discrepancy between the heap size in the "Memory Information" and "Malloc Information" views, so
before I get too far into the specifics of the leak in io-net, I would like to understand this.

Why would these two numbers not agree?

As an aside:

The leak in io-net may be related to the fact that the io-net in question (there are two separate io-nets) has the following
parameters passed to the tcpip shared object: "rx_pulse_prio=9,timer_pulse_prio=9".

Thanks,

Rennie

Sean Boudreau

Re: Problem with "System Information" in IDE

Post by Sean Boudreau » Wed Apr 20, 2005 11:28 pm

I'm no IDE expert, but io-net and the stack get memory by means other
than malloc(), e.g. mmap(). Can you post a test case that tickles
the leak?

-seanb


Rennie Allen

Re: Problem with "System Information" in IDE

Post by Rennie Allen » Thu Apr 21, 2005 4:53 pm

OK, so io-net essentially has its own heap manager that the IDE can't query for statistics. Would it be possible for the io-net
boyz and the qconn boyz to get together and allow the "System Information" perspective to query the io-net "heap manager", so that the
malloc view and the memory view always agree on how much heap there is? (I think these two views should always agree.)

As for the actual leak, I don't think you need any special code to reproduce it. It appears that any sockets-based code will exhibit
the leak in io-net if the rx_pulse_prio and timer_pulse_prio parameters (of npm-tcpip.so) are lower than the priority of at least one
thread that is a client of io-net. We have socket code here written by three different individuals for completely different
applications (some ported from other platforms, where it has run for years), and as long as the above conditions hold, they produce an io-net leak when run.

It appears that the magnitude of the leak increases in inverse relation to the size of the timeout passed to select() by the
(higher-priority) thread in the socket application.

From this black-box testing, it appears that some sort of "garbage collection" is done in response to the timer pulse; is this true?
(It doesn't appear that it could be the rx pulse, since the rx pulse is clearly getting sufficient CPU, i.e. incoming data is being
read successfully in the presence of the leak.)

If the garbage collection is done in the timer pulse handler, it seems that it might be better to move it out of there and into
either the rx pulse handler or the client handler path (where it will run at client priority). As a real-time developer, I want
to be able to move as much of the processing as possible to the lowest priority possible while maintaining correct operation.

If there is no garbage collection in the timer pulse handler, then something more subtle is going on, and I am going to need some
help (from those with access to the source) understanding what it is.

PS: If I assign a higher priority to the timer thread, the leak disappears (just stating this for clarity, even though it is implied
above).

Rennie

