Forums » Software Development »
Rocko, possible memory leak?
Added by V J almost 5 years ago
Hi
I have built the Rocko version with 4.9.76-rt61-ltsi RT-kernel and with a custom FPGA and device tree. We use two Ethernet ports.
Running "free" shows that memory used is increasing with approximately 2.4 MB each hour. I.e. it seems that there is a memory leak here.
Test start at time at 01:30:
root@mitysom-c5:~# date;free;ps aux --sort -rss | head -40 Mon Mar 9 01:30:40 UTC 2019 total used free shared buff/cache available Mem: 768732 16280 730960 176 21492 740344 Swap: 0 0 0 USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 1083 0.0 0.3 14576 2824 ? Ssl 01:30 0:00 /usr/sbin/tcf-agent -d -L- -l0 root 1102 0.1 0.3 3296 2576 ttyS0 Ss 01:30 0:00 -sh 29 1017 0.0 0.2 2928 2180 ? Ss 01:30 0:00 /usr/sbin/rpc.statd root 1062 0.4 0.2 2576 2072 ? Ss 01:30 0:00 /sbin/klogd root 1088 0.0 0.2 2828 2060 ? Ss 01:30 0:00 /bin/sh /bin/start_getty 115200 ttyS0 vt102 root 809 0.3 0.2 3184 1732 ? Ss 01:30 0:00 /sbin/udevd -d root 1109 0.0 0.2 3176 1712 ttyS0 R+ 01:30 0:00 ps aux --sort -rss root 1045 0.0 0.2 3108 1632 ? Ss 01:30 0:00 /usr/sbin/rpc.mountd root 1008 0.0 0.2 4916 1628 ? Ss 01:30 0:00 /usr/sbin/sshd rpc 1012 0.0 0.2 2212 1620 ? Ss 01:30 0:00 /usr/sbin/rpcbind message+ 959 0.0 0.2 3212 1572 ? Ss 01:30 0:00 /usr/bin/dbus-daemon --system root 1065 0.0 0.1 2496 1524 ? Ss 01:30 0:00 /usr/sbin/crond root 1022 0.0 0.1 2172 1492 ? S 01:30 0:00 /usr/sbin/atd -f root 1089 0.0 0.1 2732 1484 tty1 Ss+ 01:30 0:00 /sbin/getty 38400 tty1 root 1060 0.1 0.1 1944 1428 ? Ss 01:30 0:00 /sbin/syslogd root 1 3.2 0.1 1716 1172 ? Ss 01:30 0:01 init [5] root 1071 0.0 0.1 1804 1064 ? S 01:30 0:00 /usr/sbin/ifplugd -i eth0 -fwI -u0 -d10 root 952 0.0 0.1 2732 888 ? Ss 01:30 0:00 udhcpc -R -b -p /var/run/udhcpc.eth0.pid -i eth0 root 1110 0.0 0.1 1728 880 ttyS0 S+ 01:30 0:00 head -40 root 2 0.0 0.0 0 0 ? S 01:30 0:00 [kthreadd] root 3 0.1 0.0 0 0 ? S 01:30 0:00 [ksoftirqd/0] root 4 0.1 0.0 0 0 ? S 01:30 0:00 [ktimersoftd/0] root 5 0.0 0.0 0 0 ? S 01:30 0:00 [kworker/0:0] root 6 0.0 0.0 0 0 ? S< 01:30 0:00 [kworker/0:0H] root 7 0.0 0.0 0 0 ? S 01:30 0:00 [kworker/u4:0]
Test end at time at 06:03:
root@mitysom-c5:~# date;free;ps aux --sort -rss | head -40 Mon Mar 9 06:03:49 UTC 2019 total used free shared buff/cache available Mem: 768732 27116 720044 200 21572 729544 Swap: 0 0 0 USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 1083 0.0 0.3 14576 2824 ? Ssl 01:30 0:05 /usr/sbin/tcf-agent -d -L- -l0 root 1102 0.0 0.3 3296 2576 ttyS0 Ss 01:30 0:00 -sh 29 1017 0.0 0.2 2928 2180 ? Ss 01:30 0:00 /usr/sbin/rpc.statd root 1062 0.0 0.2 2576 2072 ? Ss 01:30 0:00 /sbin/klogd root 1088 0.0 0.2 2828 2060 ? Ss 01:30 0:00 /bin/sh /bin/start_getty 115200 ttyS0 vt102 root 1148 0.0 0.2 3176 1776 ttyS0 R+ 06:03 0:00 ps aux --sort -rss root 809 0.0 0.2 3184 1732 ? Ss 01:30 0:00 /sbin/udevd -d root 1045 0.0 0.2 3108 1632 ? Ss 01:30 0:00 /usr/sbin/rpc.mountd root 1008 0.0 0.2 4916 1628 ? Ss 01:30 0:00 /usr/sbin/sshd rpc 1012 0.0 0.2 2212 1620 ? Ss 01:30 0:00 /usr/sbin/rpcbind message+ 959 0.0 0.2 3212 1572 ? Ss 01:30 0:00 /usr/bin/dbus-daemon --system root 1065 0.0 0.1 2496 1524 ? Ss 01:30 0:00 /usr/sbin/crond root 1022 0.0 0.1 2172 1492 ? S 01:30 0:00 /usr/sbin/atd -f root 1089 0.0 0.1 2732 1484 tty1 Ss+ 01:30 0:00 /sbin/getty 38400 tty1 root 1060 0.0 0.1 1944 1428 ? Ss 01:30 0:00 /sbin/syslogd root 1 0.0 0.1 1716 1172 ? Ss 01:30 0:01 init [5] root 1071 0.0 0.1 1804 1064 ? S 01:30 0:00 /usr/sbin/ifplugd -i eth0 -fwI -u0 -d10 root 952 0.0 0.1 2732 888 ? Ss 01:30 0:00 udhcpc -R -b -p /var/run/udhcpc.eth0.pid -i eth0 root 1149 0.0 0.1 1728 884 ttyS0 S+ 06:03 0:00 head -40 root 2 0.0 0.0 0 0 ? S 01:30 0:00 [kthreadd] root 3 0.1 0.0 0 0 ? S 01:30 0:20 [ksoftirqd/0] root 4 0.1 0.0 0 0 ? S 01:30 0:18 [ktimersoftd/0] root 5 0.0 0.0 0 0 ? S 01:30 0:00 [kworker/0:0] root 6 0.0 0.0 0 0 ? S< 01:30 0:00 [kworker/0:0H] root 8 0.0 0.0 0 0 ? S 01:30 0:02 [rcu_preempt]
During these 4hours and 30min approximately 11MB (27116-16280) memory is "lost".
There is only a small amount of memory difference caused by "shared" and "buff/cache".
"ps" show no increase in memory use for the running applications.
Only the "default" applications are running when testing this.
Note: We have allocated 768MB (and not 1GB) for linux, the rest is allocated to the FPGA.
To see the impact this memory issues has on the system I have run a test where the allocated memory for linux has been deceased from 758MB to 75MB.
Result: After a few hours the system becomes unresponsive, probably because of lack of memory.
Are you aware of any memory leak issues on either Rocko, kernel 4.9 or RT?
I do not see this behavior on the original distribution which came with evaluation board.
BR
VJ
Replies (10)
RE: Rocko, possible memory leak? - Added by Daniel Vincelette almost 5 years ago
Hello VJ,
We are not aware of memory leaks on the rocko or the 4.9 kernel. Did you by chance run the original distribution while using the 4.9 kernel (so older filesystem with newer kernel)?
I have a test running on my bench to recreate what you are seeing now.
Dan
RE: Rocko, possible memory leak? - Added by V J almost 5 years ago
Hello Dan, thank you for your reply.
We have tested the original SD card where we replaced kernel 3.16 with 4.9.78-ltsi (non RT). We also needed to use a device tree from the rocko build to be able to boot properly.
The results are the same as with the rocko build, memory is “lost”:
Test start, uptime 36 min:
root@mitysom-5csx:~# uptime;free;ps aux --sort -rss | head -n 30 23:32:36 up 36 min, 1 user, load average: 0.00, 0.00, 0.00 total used free shared buffers cached Mem: 1029480 45928 983552 0 4824 12384 -/+ buffers/cache: 28720 1000760 Swap: 0 0 0 USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 1072 0.0 0.2 4152 2516 ? Ss 22:56 0:00 /usr/sbin/sshd root 1158 0.0 0.2 2840 2480 ttyS0 Ss 22:56 0:00 -sh avahi 1115 0.0 0.2 3248 2472 ? S 22:56 0:00 avahi-daemon: running [mitysom-5csx.local] root 1108 0.0 0.2 2660 2132 ? Ss 22:56 0:00 /sbin/klogd root 793 0.0 0.2 2800 2072 ? Ss 22:55 0:00 /lib/udev/udevd -d root 906 0.0 0.1 2728 1776 ? S 22:55 0:00 /lib/udev/udevd -d root 905 0.0 0.1 2728 1700 ? S 22:55 0:00 /lib/udev/udevd -d root 1038 0.0 0.1 2224 1624 ? Ss 22:56 0:00 /usr/sbin/rpcbind root 1188 0.0 0.1 2324 1592 ttyS0 R+ 23:32 0:00 ps aux --sort -rss 997 1065 0.0 0.1 2608 1564 ? Ss 22:56 0:00 /usr/bin/dbus-daemon --system root 1145 0.0 0.1 2468 1512 ? Ss 22:56 0:00 /usr/sbin/crond root 1097 0.0 0.1 2368 1460 ? Ss 22:56 0:00 /usr/sbin/rpc.mountd -f /etc/exports root 1055 0.0 0.1 1856 1416 ? S 22:56 0:00 /usr/sbin/atd -f root 1099 0.0 0.1 1912 1408 ? Ss 22:56 0:00 /usr/sbin/rpc.statd root 1106 0.0 0.1 1776 1388 ? Ss 22:56 0:00 /sbin/syslogd avahi 1116 0.0 0.1 3104 1336 ? S 22:56 0:00 avahi-daemon: chroot helper root 1159 0.0 0.1 2196 1324 tty1 Ss+ 22:56 0:00 /sbin/getty 38400 tty1 root 1130 0.0 0.1 1932 1264 ? S 22:56 0:00 /usr/sbin/avahi-dnsconfd -D root 1142 0.0 0.1 2232 1260 ? S 22:56 0:00 /usr/sbin/lighttpd -f /etc/lighttpd.conf root 1 0.0 0.1 1688 1176 ? Ss 22:55 0:01 init [5] root 1151 0.0 0.1 1740 1072 ? S 22:56 0:00 /usr/sbin/ifplugd -i eth0 -fwI -u0 -d10 root 1189 0.0 0.0 1696 936 ttyS0 S+ 23:32 0:00 head -n 30 root 1033 0.0 0.0 2192 752 ? Ss 22:56 0:00 udhcpc -R -n -p /var/run/udhcpc.eth0.pid -i eth0 root 2 0.0 0.0 0 0 ? S 22:55 0:00 [kthreadd] root 3 0.0 0.0 0 0 ? S 22:55 0:00 [ksoftirqd/0] root 5 0.0 0.0 0 0 ? S< 22:55 0:00 [kworker/0:0H] root 7 0.0 0.0 0 0 ? S 22:55 0:00 [rcu_sched] root 8 0.0 0.0 0 0 ? S 22:55 0:00 [rcu_bh] root 9 0.0 0.0 0 0 ? S 22:55 0:00 [migration/0]
Test stop, uptime 3h 24 min:
root@mitysom-5csx:~# uptime;free;ps aux --sort -rss | head -n 30 02:19:56 up 3:24, 1 user, load average: 0.00, 0.00, 0.00 total used free shared buffers cached Mem: 1029480 52896 976584 0 4924 12488 -/+ buffers/cache: 35484 993996 Swap: 0 0 0 USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 1072 0.0 0.2 4152 2516 ? Ss Nov05 0:00 /usr/sbin/sshd root 1158 0.0 0.2 2844 2484 ttyS0 Ss Nov05 0:00 -sh avahi 1115 0.0 0.2 3248 2476 ? S Nov05 0:01 avahi-daemon: running [mitysom-5csx.local] root 1108 0.0 0.2 2660 2132 ? Ss Nov05 0:00 /sbin/klogd root 793 0.0 0.2 2800 2072 ? Ss Nov05 0:00 /lib/udev/udevd -d root 906 0.0 0.1 2728 1776 ? S Nov05 0:00 /lib/udev/udevd -d root 905 0.0 0.1 2728 1700 ? S Nov05 0:00 /lib/udev/udevd -d root 1038 0.0 0.1 2224 1624 ? Ss Nov05 0:00 /usr/sbin/rpcbind root 1236 0.0 0.1 2324 1604 ttyS0 R+ 02:19 0:00 ps aux --sort -rss 997 1065 0.0 0.1 2608 1564 ? Ss Nov05 0:00 /usr/bin/dbus-daemon --system root 1145 0.0 0.1 2468 1512 ? Ss Nov05 0:00 /usr/sbin/crond root 1097 0.0 0.1 2368 1460 ? Ss Nov05 0:00 /usr/sbin/rpc.mountd -f /etc/exports root 1055 0.0 0.1 1856 1416 ? S Nov05 0:00 /usr/sbin/atd -f root 1099 0.0 0.1 1912 1408 ? Ss Nov05 0:00 /usr/sbin/rpc.statd root 1106 0.0 0.1 1776 1388 ? Ss Nov05 0:00 /sbin/syslogd avahi 1116 0.0 0.1 3104 1336 ? S Nov05 0:00 avahi-daemon: chroot helper root 1159 0.0 0.1 2196 1324 tty1 Ss+ Nov05 0:00 /sbin/getty 38400 tty1 root 1130 0.0 0.1 1932 1264 ? S Nov05 0:00 /usr/sbin/avahi-dnsconfd -D root 1142 0.0 0.1 2232 1260 ? S Nov05 0:00 /usr/sbin/lighttpd -f /etc/lighttpd.conf root 1 0.0 0.1 1688 1176 ? Ss Nov05 0:01 init [5] root 1151 0.0 0.1 1740 1072 ? S Nov05 0:00 /usr/sbin/ifplugd -i eth0 -fwI -u0 -d10 root 1237 0.0 0.0 1696 916 ttyS0 S+ 02:19 0:00 head -n 30 root 1033 0.0 0.0 2192 752 ? Ss Nov05 0:00 udhcpc -R -n -p /var/run/udhcpc.eth0.pid -i eth0 root 2 0.0 0.0 0 0 ? S Nov05 0:00 [kthreadd] root 3 0.0 0.0 0 0 ? S Nov05 0:00 [ksoftirqd/0] root 5 0.0 0.0 0 0 ? S< Nov05 0:00 [kworker/0:0H] root 7 0.0 0.0 0 0 ? S Nov05 0:02 [rcu_sched] root 8 0.0 0.0 0 0 ? S Nov05 0:00 [rcu_bh] root 9 0.0 0.0 0 0 ? S Nov05 0:00 [migration/0]
During these 2h 48min approximately 7MB (52896-45928) memory is "lost". I.e. 2.5MB/hour.
BR
VJ
RE: Rocko, possible memory leak? - Added by Daniel Vincelette almost 5 years ago
I enabled CONFIG_DEBUG_KMEMLEAK in the 4.9 kernel, which is supposed to report any kernel memory leaks it finds. I ran it all night and it didn't report any. I am seeing about a 1MB leak an hour through free though.
This leads me to believe it's something on the filesystem causing this issue. My next option is to try the warrior or zeus branch of yocto to see if the issue still persists.
Dan
RE: Rocko, possible memory leak? - Added by V J almost 5 years ago
Hi Dan
Thanks for looking into this.
I have run some test during the weekend, and the results are a bit inconclusive. However, it seems that Ethernet has some impact:
Tests where "free" output is logged:
1: No Ethernet cables connected. No increased memory use.
2: Ethernet cables connected, but no dedicated traffic or connections (like ssh) to the board. Sometimes increased memory use, sometimes not.
3a: Ethernet cables connected, but no dedicated traffic or connections (like ssh) to the board, except that the board is pinged every sec. Increased memory use, ca 2.5 MB/hour
3b: Turning of pinging: The memory use stops.
The statistics behind these measurements are poor, so it is only observations, not conclusions.
BR
VJ
RE: Rocko, possible memory leak? - Added by Daniel Vincelette almost 5 years ago
Hi VJ
Those are definitely interesting findings leads me to believe like yourself some part of the ethernet subsystem is causing it. Would you say your test for 3b and 2 are the same? If not what are the differences?
I switched to the zeus branch for yocto, hoping it was an old bug in maybe busybox and still see the used memory increase. I'm trying to compile in some tools to give me some more insight.
Dan
RE: Rocko, possible memory leak? - Added by V J almost 5 years ago
Hi Dan
Yes, 2 and 3b are the same.
I have done some more monitoring where Eth0 is connected but not used and Eth1 is pinged every sec.
Next eth0 and eth1 are taken down after 11h 30m:
mitysom-c5:~# uptime;free 17:29:00 up 11:27, 1 user, load average: 0.00, 0.00, 0.00 total used free shared buff/cache available Mem: 768724 70256 651328 300 47140 683816 Swap: 0 0 mitysom-c5:~# ifconfig eth0 down mitysom-c5:~# uptime;free 17:30:47 up 11:29, 1 user, load average: 0.00, 0.00, 0.00 total used free shared buff/cache available Mem: 768724 65000 656608 304 47116 689068 Swap: 0 0 mitysom-c5:~# ifconfig eth1 down mitysom-c5:~# mitysom-c5:~# uptime;free 17:31:24 up 11:29, 1 user, load average: 0.00, 0.00, 0.00 total used free shared buff/cache available Mem: 768724 55624 666040 304 47060 698444 Swap: 0 0 0
This frees up approximately 5MB for eth0 and 10MB for eth1 so it seems that the ethernet driver is quite memory hungry...
Note: kmemleak is enabled so the used memory is a bit higher at startup compared to the previous measurements.
BR
VJ
RE: Rocko, possible memory leak? - Added by V J almost 5 years ago
Hi Dan.
I have monitored the memory use over the weekend. See attached plot where "free" is logged every 10 minutes.
Test steps:
1: Booting the board. Ethernet cables are connected to Eth0 and Eth1, but no dedicated traffic or connections (like ssh) to the board.
2: At approximately 1300 minutes (130*10): Starting Eth1 ping every 10ms.
3: At approximately 1800 minutes (180*10): Stopping ping.
It seems that the Ethernet requires a fair amount of memory which is allocated in a slowly increasing (and interesting?) pattern when starting pinging.
However, this memory is slowly freed when the pinging stops.
The first approximately 450 minutes (45*10) has a similar slowly increasing memory use which is partly freed.
Maybe this is just how the Ethernet interface works?
BR
VJ
picture95-1.png (6.03 KB) picture95-1.png | Used memory every 10 minutes |
RE: Rocko, possible memory leak? - Added by Michael Williamson almost 5 years ago
Hello Dan/VJ,
Have you guys seen this stack overflow post about tracking socket buffers /
memory moving through the network stacks?
Might provide some insight....
-Mike
On Mon, Mar 23, 2020 at 6:56 AM <redmine@criticallink.com> wrote:
RE: Rocko, possible memory leak? - Added by V J almost 5 years ago
Dan, based upon the measurements I have done and input from Mike I think I have to write this off as normal behavior.
I have done some stress test where I stream data over Ethernet and the memory use does not increase uncontrolled.
BR
VJ
RE: Rocko, possible memory leak? - Added by Daniel Vincelette almost 5 years ago
VJ,
I agree after looking at your data and with the knowledge that I've run 5CSX SOMs for weeks plugged into the network without having them dead locking.
It looks like the kernel uses a fair amount of memory for ethernet, though it can properly garbage collect it when need be.
Dan