Message-ID: <2353c149-cae1-f986-63d0-3568534a1e8c@itcare.pl>
Date:   Mon, 11 Dec 2017 22:48:05 +0100
From:   Paweł Staszewski <pstaszewski@...are.pl>
To:     Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: Huge memory leak with 4.15.0-rc2+



On 2017-12-11 at 22:23, Paweł Staszewski wrote:
> Hi
>
>
> I just upgraded some testing host to 4.15.0-rc2+ kernel
>
> After some time of traffic processing, once traffic on all ports
> reached about 3 Mpps, a memory leak started.
>
> Graph attached from memory usage: https://ibb.co/idK4zb
>
>
>
> HW config:
>
> Intel E5
>
> 8x Intel 82599 (used ixgbe driver from kernel)
>
> Interfaces with vlans attached
>
> All 8 ethernet ports are in one LAG group configured by team.
>
> With current settings
>
> (this host is acting as a router - and the bgpd process has been using
> the same amount of memory, about 5.2 GB, since startup)
>
>  cat /proc/meminfo
> MemTotal:       32770588 kB
> MemFree:        11342492 kB
> MemAvailable:   10982752 kB
> Buffers:           84704 kB
> Cached:            83180 kB
> SwapCached:            0 kB
> Active:          5105320 kB
> Inactive:          46252 kB
> Active(anon):    4985448 kB
> Inactive(anon):     1096 kB
> Active(file):     119872 kB
> Inactive(file):    45156 kB
> Unevictable:           0 kB
> Mlocked:               0 kB
> SwapTotal:       4005280 kB
> SwapFree:        4005280 kB
> Dirty:               236 kB
> Writeback:             0 kB
> AnonPages:       4983752 kB
> Mapped:            13556 kB
> Shmem:              2852 kB
> Slab:            1013124 kB
> SReclaimable:      45876 kB
> SUnreclaim:       967248 kB
> KernelStack:        7152 kB
> PageTables:        12164 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:    20390572 kB
> Committed_AS:     396568 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:           0 kB
> VmallocChunk:          0 kB
> HardwareCorrupted:     0 kB
> AnonHugePages:         0 kB
> ShmemHugePages:        0 kB
> ShmemPmdMapped:        0 kB
> CmaTotal:              0 kB
> CmaFree:               0 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> DirectMap4k:     1407572 kB
> DirectMap2M:    20504576 kB
> DirectMap1G:    13631488 kB
>
> ps aux --sort -rss
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root      6758  1.8 14.9 5044996 4886964 ?     Sl   01:22  23:21 
> /usr/local/sbin/bgpd -d  -u root -g root -I --ignore_warnings
> root      6752  0.0  0.1  86272 61920 ?        Ss   01:22   0:16 
> /usr/local/sbin/zebra -d  -u root -g root -I --ignore_warnings
> root      6766 12.6  0.0  51592 29196 ?        S    01:22 157:48 
> /usr/sbin/snmpd -p /var/run/snmpd.pid -Ln
> root      7494  0.0  0.0 708976  5896 ?        Ssl  01:22   0:09 
> /opt/collectd/sbin/collectd
> root     15531  0.0  0.0  67864  5056 ?        Ss   21:57   0:00 sshd: 
> paol [priv]
> root      4915  0.0  0.0 271912  4904 ?        Ss   01:21   0:25 
> /usr/sbin/syslog-ng --persist-file 
> /var/lib/syslog-ng/syslog-ng.persist --cfgfile 
> /etc/syslog-ng/syslog-ng.conf --pidfile /run/syslog-ng.pid
> root      4278  0.0  0.0  37220  4164 ?        Ss   01:21   0:00 
> /lib/systemd/systemd-udevd --daemon
> root      5147  0.0  0.0  32072  3232 ?        Ss   01:21   0:00 
> /usr/sbin/sshd
> root      5203  0.0  0.0  28876  2436 ?        S    01:21   0:00 teamd 
> -d -f /etc/teamd.conf
> root     17372  0.0  0.0  17924  2388 pts/2    R+   22:13   0:00 ps 
> aux --sort -rss
> root      4789  0.0  0.0   5032  2176 ?        Ss   01:21   0:00 mdadm 
> --monitor --scan --daemonise --pid-file /var/run/mdadm.pid --syslog
> root      7511  0.0  0.0  12676  1920 tty4     Ss+  01:22   0:00 
> /sbin/agetty 38400 tty4 linux
> root      7510  0.0  0.0  12676  1896 tty3     Ss+  01:22   0:00 
> /sbin/agetty 38400 tty3 linux
> root      7512  0.0  0.0  12676  1860 tty5     Ss+  01:22   0:00 
> /sbin/agetty 38400 tty5 linux
> root      7513  0.0  0.0  12676  1836 tty6     Ss+  01:22   0:00 
> /sbin/agetty 38400 tty6 linux
> root      7509  0.0  0.0  12676  1832 tty2     Ss+  01:22   0:00 
> /sbin/agetty 38400 tty2 linux
>
> And the latest kernel on which everything worked is 4.14.3.
>
>
> One observation: when I disable TSO on all cards, the memory leak is
> worse.
>
>
>
>
>
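For reference, toggling TSO as described above is done per port with ethtool; the interface name here is a placeholder, not one from this host:

```shell
# Check the current TSO setting (eth0 is a placeholder interface name)
ethtool -k eth0 | grep tcp-segmentation-offload

# Disable / re-enable TSO on that port
ethtool -K eth0 tso off
ethtool -K eth0 tso on
```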
When traffic starts to drop, there is less and less leakage.
Link to the memory usage graph below:
https://ibb.co/hU97kG

And slab_unrecl keeps rising - the amount of unreclaimable memory used
for kernel slab allocations.
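A quick way to watch that counter is to pull the slab lines straight out of /proc/meminfo (values are in kB); wrap it in a loop or `watch` to track growth over time:

```shell
# Print Slab / SReclaimable / SUnreclaim from /proc/meminfo.
# SUnreclaim growing without bound while traffic flows is the
# symptom described above.
awk '/^(Slab|SReclaimable|SUnreclaim):/ {printf "%-14s %10d kB\n", $1, $2}' /proc/meminfo
```

Per-cache detail (which slab cache is actually growing) is in /proc/slabinfo or `slabtop`.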


Forgot to add that I'm using HFSC, with qdiscs like pfifo on the classes.
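For context, a minimal HFSC-with-pfifo setup of the kind mentioned might look like this; the device name, class IDs, and rates are illustrative assumptions, not the actual configuration of this host:

```shell
# Root HFSC qdisc, one leaf class, and a pfifo attached to it (illustrative)
tc qdisc add dev eth0 root handle 1: hfsc default 10
tc class add dev eth0 parent 1: classid 1:10 hfsc sc rate 1gbit ul rate 1gbit
tc qdisc add dev eth0 parent 1:10 handle 10: pfifo limit 1000
```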
