Message-ID: <CAKgT0Ud=GRrsNtoM0gOgzsh1uPJ7nuT9S-xbigmbUMi+iz-ufw@mail.gmail.com>
Date:   Wed, 18 Oct 2017 16:37:30 -0700
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     Paweł Staszewski <pstaszewski@...are.pl>
Cc:     Pavlos Parissis <pavlos.parissis@...il.com>,
        "Anders K. Pedersen | Cohaesio" <akp@...aesio.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>,
        "alexander.h.duyck@...el.com" <alexander.h.duyck@...el.com>
Subject: Re: Linux 4.12+ memory leak on router with i40e NICs

On Wed, Oct 18, 2017 at 4:22 PM, Paweł Staszewski <pstaszewski@...are.pl> wrote:
>
>
> On 2017-10-19 at 00:58, Paweł Staszewski wrote:
>
>>
>>
>> On 2017-10-19 at 00:50, Paweł Staszewski wrote:
>>>
>>>
>>>
>>> On 2017-10-19 at 00:20, Paweł Staszewski wrote:
>>>>
>>>>
>>>>
>>>> On 2017-10-18 at 17:44, Paweł Staszewski wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2017-10-17 at 16:08, Paweł Staszewski wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2017-10-17 at 13:52, Paweł Staszewski wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2017-10-17 at 13:05, Paweł Staszewski wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2017-10-17 at 12:59, Paweł Staszewski wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2017-10-17 at 12:51, Paweł Staszewski wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2017-10-17 at 12:20, Paweł Staszewski wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2017-10-17 at 11:48, Paweł Staszewski wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2017-10-17 at 02:44, Paweł Staszewski wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2017-10-17 at 01:56, Alexander Duyck wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Oct 16, 2017 at 4:34 PM, Paweł Staszewski
>>>>>>>>>>>>>> <pstaszewski@...are.pl> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2017-10-16 at 18:26, Paweł Staszewski wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 2017-10-16 at 13:20, Pavlos Parissis wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 15/10/2017 02:58 AM, Alexander Duyck wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Pawel,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> To clarify is that Dave Miller's tree or Linus's that you
>>>>>>>>>>>>>>>>>> are talking
>>>>>>>>>>>>>>>>>> about? If it is Dave's tree how long ago was it you pulled
>>>>>>>>>>>>>>>>>> it since I
>>>>>>>>>>>>>>>>>> think the fix was just pushed by Jeff Kirsher a few days
>>>>>>>>>>>>>>>>>> ago.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The issue should be fixed in the following commit:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/drivers/net/ethernet/intel/i40e/i40e_txrx.c?id=2b9478ffc550f17c6cd8c69057234e91150f5972
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Do you know when it is going to be available on net-next
>>>>>>>>>>>>>>>>> and linux-stable
>>>>>>>>>>>>>>>>> repos?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> Pavlos
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I will run some tests tonight with the "net" git tree, where
>>>>>>>>>>>>>>>> this patch is
>>>>>>>>>>>>>>>> included.
>>>>>>>>>>>>>>>> Starting from 0:00 CET
>>>>>>>>>>>>>>>> :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Upgraded, and it looks like the problem is not solved by that patch.
>>>>>>>>>>>>>>> Currently running the system with the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/
>>>>>>>>>>>>>>> kernel.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Still about 0.5GB of memory is leaking somewhere
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I can also confirm that the latest kernel where memory is not
>>>>>>>>>>>>>>> leaking (with the i40e driver and Intel 710 cards) is 4.11.12.
>>>>>>>>>>>>>>> With kernel 4.11.12, after an hour there is no change in memory usage.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I also checked that with ixgbe instead of i40e, on the same
>>>>>>>>>>>>>>> net.git kernel, there is no memleak: after an hour memory usage is
>>>>>>>>>>>>>>> the same, so I am 100% sure this is an i40e driver problem.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So how long was the run to get the .5GB of memory leaking?
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1 hour
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also is there any chance of you being able to bisect to
>>>>>>>>>>>>>> determine
>>>>>>>>>>>>>> where the memory leak was introduced since as you pointed out
>>>>>>>>>>>>>> it
>>>>>>>>>>>>>> didn't exist in 4.11.12 so odds are it was introduced
>>>>>>>>>>>>>> somewhere
>>>>>>>>>>>>>> between 4.11 and the latest kernel release.
>>>>>>>>>>>>>
>>>>>>>>>>>>> That may be hard, because for now I need to go back to 4.11.12; this
>>>>>>>>>>>>> is a production host/router.
>>>>>>>>>>>>> I will try to find a free/test router for tests/bisects with
>>>>>>>>>>>>> the i40e driver (Intel 710 cards).
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> I also forgot to add the errors from i40e when the driver initializes:
>>>>>>>>>>>> [   15.760569] i40e 0000:02:00.1: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>>>>>>>>>> [   16.365587] i40e 0000:03:00.3: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>>>>>>>>>> [   16.367686] i40e 0000:02:00.2: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>>>>>>>>>> [   16.368816] i40e 0000:03:00.0: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>>>>>>>>>> [   16.369877] i40e 0000:03:00.2: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>>>>>>>>>> [   16.370941] i40e 0000:02:00.3: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>>>>>>>>>> [   16.372005] i40e 0000:02:00.0: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>>>>>>>>>> [   16.373029] i40e 0000:03:00.1: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>>>>>>>>>>
>>>>>>>>>>>> Some params that are set for these NICs:
>>>>>>>>>>>>         ip link set up dev $i
>>>>>>>>>>>>         ethtool -A $i autoneg off rx off tx off
>>>>>>>>>>>>         ethtool -G $i rx 1024 tx 2048
>>>>>>>>>>>>         ip link set $i txqueuelen 1000
>>>>>>>>>>>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 512 tx-usecs 128
>>>>>>>>>>>>         ethtool -L $i combined 6
>>>>>>>>>>>>         #ethtool -N $i rx-flow-hash udp4 sdfn
>>>>>>>>>>>>         ethtool -K $i ntuple on
>>>>>>>>>>>>         ethtool -K $i gro off
>>>>>>>>>>>>         ethtool -K $i tso off
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> Also, after turning TSO/GRO on, memory usage changes and the leak
>>>>>>>>>>> gets faster.
>>>>>>>>>>> The image below shows memory usage before the change, with TSO/GRO off,
>>>>>>>>>>> and after enabling TSO/GRO:
>>>>>>>>>>>
>>>>>>>>>>> https://ibb.co/dTqBY6
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Pawel
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> With settings like this:
>>>>>>>>>> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1
>>>>>>>>>> enp3s0f2 enp3s0f3'
>>>>>>>>>> for i in $ifc
>>>>>>>>>>         do
>>>>>>>>>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 512 tx-usecs 128
>>>>>>>>>>         ethtool -K $i gro on
>>>>>>>>>>         ethtool -K $i tso on
>>>>>>>>>>
>>>>>>>>>>         done
>>>>>>>>>>
>>>>>>>>>> The server is leaking about 4-6MB every 10 seconds
>>>>>>>>>> MEMLEAK:
>>>>>>>>>> 5  MB/10sec
>>>>>>>>>> 6  MB/10sec
>>>>>>>>>> 4  MB/10sec
>>>>>>>>>> 4  MB/10sec
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Other settings TSO/GRO off
>>>>>>>>>> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1
>>>>>>>>>> enp3s0f2 enp3s0f3'
>>>>>>>>>> for i in $ifc
>>>>>>>>>>         do
>>>>>>>>>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 512 tx-usecs 128
>>>>>>>>>>         ethtool -K $i gro off
>>>>>>>>>>         ethtool -K $i tso off
>>>>>>>>>>
>>>>>>>>>>         done
>>>>>>>>>>
>>>>>>>>>> Same leak about 5MB per 10 seconds
>>>>>>>>>> MEMLEAK:
>>>>>>>>>> 5  MB/10sec
>>>>>>>>>> 5  MB/10sec
>>>>>>>>>> 5  MB/10sec
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Other settings rx-usecs change from 512 to 1024:
>>>>>>>>>> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1
>>>>>>>>>> enp3s0f2 enp3s0f3'
>>>>>>>>>> for i in $ifc
>>>>>>>>>>         do
>>>>>>>>>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 1024 tx-usecs 128
>>>>>>>>>>         ethtool -K $i gro off
>>>>>>>>>>         ethtool -K $i tso off
>>>>>>>>>>
>>>>>>>>>>         done
>>>>>>>>>>
>>>>>>>>>> MEMLEAK:
>>>>>>>>>> 4  MB/10sec
>>>>>>>>>> 3  MB/10sec
>>>>>>>>>> 4  MB/10sec
>>>>>>>>>> 4  MB/10sec
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> So the memleak has something to do with rx-usecs (fewer interrupts but
>>>>>>>>>> higher latency for traffic).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> But enabling TSO/GRO also makes the leak about 1MB bigger per 10
>>>>>>>>>> seconds.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> So far best config is:
>>>>>>>>> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2
>>>>>>>>> enp3s0f3'
>>>>>>>>> for i in $ifc
>>>>>>>>>         do
>>>>>>>>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 64 tx-usecs 512
>>>>>>>>>         ethtool -K $i gro off
>>>>>>>>>         ethtool -K $i tso on
>>>>>>>>>
>>>>>>>>>         done
>>>>>>>>>
>>>>>>>>> MEMLEAK - about 2MB/10secs
>>>>>>>>> 2  MB/10sec
>>>>>>>>> 2  MB/10sec
>>>>>>>>> 2  MB/10sec
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> With - rx-usecs set to 256 (about 7-9MB/10secs memleak)
>>>>>>>>> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2
>>>>>>>>> enp3s0f3'
>>>>>>>>> for i in $ifc
>>>>>>>>>         do
>>>>>>>>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 256 tx-usecs 512
>>>>>>>>>         ethtool -K $i gro off
>>>>>>>>>         ethtool -K $i tso on
>>>>>>>>>
>>>>>>>>>         done
>>>>>>>>>
>>>>>>>>> MEMLEAK:
>>>>>>>>> 7  MB/10sec
>>>>>>>>> 7  MB/10sec
>>>>>>>>> 8  MB/10sec
>>>>>>>>> 9  MB/10sec
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> And even less memleak with rx-usecs set to 32
>>>>>>>> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2
>>>>>>>> enp3s0f3'
>>>>>>>> for i in $ifc
>>>>>>>>         do
>>>>>>>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 32 tx-usecs 512
>>>>>>>>         ethtool -K $i gro off
>>>>>>>>         ethtool -K $i tso on
>>>>>>>>
>>>>>>>>         done
>>>>>>>>
>>>>>>>>
>>>>>>>> MEMLEAK - about 0-2MB for each 10 seconds
>>>>>>>> 0  MB/10sec
>>>>>>>> 1  MB/10sec
>>>>>>>> 0  MB/10sec
>>>>>>>> 2  MB/10sec
>>>>>>>> 1  MB/10sec
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> So the best settings for now, to leak as little as possible (rx-usecs
>>>>>>> set to 16):
>>>>>>> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2
>>>>>>> enp3s0f3'
>>>>>>> for i in $ifc
>>>>>>>         do
>>>>>>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 16 tx-usecs 768
>>>>>>>         ethtool -K $i gro on
>>>>>>>         ethtool -K $i tso on
>>>>>>>
>>>>>>>         done
>>>>>>>
>>>>>>>
>>>>>>> MEMLEAK: (0-1MB/10seconds)
>>>>>>> 0  MB/10sec
>>>>>>> 0  MB/10sec
>>>>>>> 0  MB/10sec
>>>>>>> 1  MB/10sec
>>>>>>> 1  MB/10sec
>>>>>>> -1  MB/10sec
>>>>>>> 1  MB/10sec
>>>>>>> 1  MB/10sec
>>>>>>> 0  MB/10sec
>>>>>>>
>>>>>>> (there are some memory recycles - so this is good :) )
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Compared to(rx-usecs 512):
>>>>>>>
>>>>>>> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2
>>>>>>> enp3s0f3'
>>>>>>> for i in $ifc
>>>>>>>         do
>>>>>>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 512 tx-usecs 128
>>>>>>>         ethtool -K $i gro on
>>>>>>>         ethtool -K $i tso on
>>>>>>>
>>>>>>>         done
>>>>>>>
>>>>>>> The server is leaking about 4-6MB every 10 seconds
>>>>>>> MEMLEAK:
>>>>>>> 5  MB/10sec
>>>>>>> 6  MB/10sec
>>>>>>> 4  MB/10sec
>>>>>>> 4  MB/10sec
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> And a graph of the period over which all the rx-usecs changes were made:
>>>>>> https://ibb.co/nrRfbR
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> I can't eliminate the problem with settings. The memleak gets bigger or
>>>>> smaller with rx-usecs, and is less visible at low values, but then I have
>>>>> 100% CPU load, so I can't keep rx-usecs set to 16.
>>>>>
>>>>> I also can't find another host with the same cards, or one using the i40e
>>>>> driver, for bisecting tests.
>>>>> So I will just replace them with Mellanox :)
>>>>>
>>>>>
>>>> Also after fresh reboot with i40e
>>>> startup settings:
>>>> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2
>>>> enp3s0f3'
>>>> for i in $ifc
>>>>         do
>>>>         ip link set up dev $i
>>>>         ethtool -A $i autoneg off rx off tx off
>>>>         ethtool -G $i rx 2048 tx 2048
>>>>         ip link set $i txqueuelen 1000
>>>>         #ethtool -C $i rx-usecs 256
>>>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 17 tx-usecs 125
>>>>         ethtool -L $i combined 6
>>>>         #ethtool -N $i rx-flow-hash udp4 sdfn
>>>>         #ethtool -K $i ntuple on
>>>>         #ethtool -K $i gro off
>>>>         #ethtool -K $i tso off
>>>>         done
>>>>
>>>>
>>>> After issuing:
>>>>
>>>>  ethtool -K enp2s0f0 gro on tso on
>>>>
>>>> dmesg shows
>>>> [35764.338259] i40e 0000:02:00.0: PF reset failed, -15
>>>>
>>>>
>>>> and no traffic on the card :)
>>>>
>>>>
>>> Also checked now
>>> bigger rx ring
>>>         ethtool -G $i rx 2048 tx 2048
>>>
>>>
>>> Bigger memleak :)
>>>
>>>
>>>
>> OK, I need to change the cards to ixgbe now. No reply and no help for i40e,
>> so ...
>>
>> Maybe someone else with i40e will gather more data. I have only this host
>> so far. I will try to install these cards in other hosts after the change,
>> but all this moving will take about 2, maybe 3 months. Nobody on my team
>> wants to buy cards that use i40e now because of this bug, so it is hard
>> to debug. I also need to change all cards >10G to Mellanox, which
>> has no such bug ... sorry :)
>>
>>
> Last tests from my side:)
> settings
> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2
> enp3s0f3'
> for i in $ifc
>         do
>         ip link set up dev $i
>         ethtool -A $i autoneg off rx off tx off
>         ethtool -G $i rx 2048 tx 2048
>         ip link set $i txqueuelen 1000
>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 17 tx-usecs 125
>         ethtool -L $i combined 6
>         ethtool -K $i ntuple on
>         ethtool -K $i gro on
>         ethtool -K $i tso on
>         done
>
> MEMLEAK 1-2MB/10secs
> 1  MB/10sec
> 2  MB/10sec
> 1  MB/10sec
> 2  MB/10sec
> 2  MB/10sec
> 2  MB/10sec
> 1  MB/10sec
> 2  MB/10sec
> 2  MB/10sec
> 2  MB/10sec
> 1  MB/10sec
> 2  MB/10sec
> 1  MB/10sec
> 1  MB/10sec
> 0  MB/10sec
> 2  MB/10sec
> 2  MB/10sec
> 0  MB/10sec
> 2  MB/10sec
> 5  MB/10sec
>
> Change rx-usecs 16 tx usecs 16
> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2
> enp3s0f3'
> for i in $ifc
>         do
>         ip link set up dev $i
>         ethtool -A $i autoneg off rx off tx off
>         ethtool -G $i rx 2048 tx 2048
>         ip link set $i txqueuelen 1000
>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 16 tx-usecs 16
>         ethtool -L $i combined 6
>         ethtool -K $i ntuple on
>         ethtool -K $i gro on
>         ethtool -K $i tso on
>         done
>
> MEMLEAK: 0-2MB/10secs with some recycles
> 0  MB/10sec
> 0  MB/10sec
> 0  MB/10sec
> 0  MB/10sec
> 0  MB/10sec
> 0  MB/10sec
> 1  MB/10sec
> 0  MB/10sec
> 2  MB/10sec
> 0  MB/10sec
> 2  MB/10sec
> -1  MB/10sec
> 0  MB/10sec
> 2  MB/10sec
> 0  MB/10sec
> 2  MB/10sec
> -1  MB/10sec
> 1  MB/10sec

This data doesn't tell me much of anything and isn't what I asked for.
I don't see how the interrupt throttling rate would be associated with
your memory leak other than possibly rate limiting it by rate limiting
the traffic itself. Is there something that gave you the impression
that interrupt rate was somehow involved?

When we last talked I had asked if you could do a git bisect to find
the memory leak and you said you would look into it. The most useful
way to solve this would be to do a git bisect between your current
kernel and the 4.11 kernel to find the point at which this started. If
we can do that then fixing this becomes much simpler as we just have
to fix the patch that introduced the issue.
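
For reference, a bisect along those lines might look roughly like the
sketch below. It is only an illustration and makes several assumptions:
a mainline kernel clone in the current directory, v4.11 as the good
point, v4.12 as the bad point (taken from the subject line), and the
search limited to the i40e driver directory, which would miss a
regression introduced elsewhere. The `verdict` and `start_bisect`
helpers and the 1 MB/10sec threshold are hypothetical.

```shell
#!/bin/sh
# Sketch only: nothing here runs until start_bisect is called by hand.

# Hypothetical helper: turn a measured leak rate (whole MB per 10 s)
# into the word that git bisect expects. The threshold is an assumption.
verdict() {
    if [ "$1" -gt 1 ]; then echo bad; else echo good; fi
}

# Hypothetical wrapper: begin a bisection restricted to the i40e driver.
start_bisect() {
    git bisect start -- drivers/net/ethernet/intel/i40e &&
    git bisect bad v4.12 &&
    git bisect good v4.11
}

# For each commit git checks out: build, boot, measure, then record it:
#   git bisect "$(verdict 5)"    # a 5 MB/10sec run marks the commit bad
#   git bisect "$(verdict 0)"    # a 0 MB/10sec run marks the commit good
# When git names the first bad commit, clean up with: git bisect reset
```

Dropping the path argument from `git bisect start` widens the search to
the whole tree at the cost of more build-and-boot rounds.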

Also, I don't know what you are using to determine that there is a
memory leak. What tool are you using to do the tracking? Is
there any specific form of traffic that is causing the leak? If you
can't perform the bisection, any information you could provide that
would allow me to do it would also be useful.
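
As one possible way to produce "MB/10sec" figures like the ones quoted
above, the sketch below samples MemAvailable from /proc/meminfo at a
fixed interval. It is not the tool used in this thread, which is not
stated; the function names are hypothetical, and MemAvailable is only an
assumed proxy for leaked memory.

```shell
#!/bin/sh
# Sketch only: MemAvailable is one possible proxy for leaked memory.

# Print the MemAvailable value in kB from a meminfo-style file
# (defaults to /proc/meminfo).
mem_available_kb() {
    awk '$1 == "MemAvailable:" { print $2 }' "${1:-/proc/meminfo}"
}

# Print how many whole MB were lost between two kB readings
# (arguments: before, after). Negative means memory was returned.
lost_mb() {
    echo $(( ($1 - $2) / 1024 ))
}

# Hypothetical loop: report the loss for every interval (10 s by default).
# Runs until interrupted.
watch_leak() {
    interval=${1:-10}
    prev=$(mem_available_kb)
    while sleep "$interval"; do
        cur=$(mem_available_kb)
        echo "$(lost_mb "$prev" "$cur")  MB/${interval}sec"
        prev=$cur
    done
}
```

Calling `watch_leak 10` prints one line per interval. Comparing
/proc/slabinfo snapshots taken some time apart may help narrow down which
allocation is growing.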

Thanks.

- Alex
