Message-ID: <a2f17d32-b997-0107-008a-f9679af38aba@gmail.com>
Date: Wed, 13 Dec 2017 13:05:49 -0800
From: John Fastabend <john.fastabend@...il.com>
To: Paweł Staszewski <pstaszewski@...are.pl>,
Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: Huge memory leak with 4.15.0-rc2+
On 12/12/2017 09:57 AM, Paweł Staszewski wrote:
>
>
> On 2017-12-11 at 23:27, Paweł Staszewski wrote:
>>
>>
>> On 2017-12-11 at 23:15, John Fastabend wrote:
>>> On 12/11/2017 01:48 PM, Paweł Staszewski wrote:
>>>>
>>>> On 2017-12-11 at 22:23, Paweł Staszewski wrote:
>>>>> Hi
>>>>>
>>>>>
>>>>> I just upgraded a test host to a 4.15.0-rc2+ kernel.
>>>>>
>>>>> After some time processing traffic, once traffic across all ports
>>>>> reached about 3 Mpps, a memory leak started.
>>>>>
>>>
>>> [...]
>>>
>>>>> One observation: when I disable TSO on all cards, the memory leak
>>>>> gets worse.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>> When traffic starts to drop, the leak slows down.
>>>> Link to the memory usage graph:
>>>> https://ibb.co/hU97kG
>>>>
>>>> slab_unrecl (the amount of unreclaimable memory used for kernel
>>>> slab allocations) is also rising.
>>>>
>>>>
>>>> I forgot to add that I'm using hfsc, with qdiscs like pfifo on the classes.
>>>>
>>>>
>>> Maybe some error case I missed in the qdisc patches. I'm looking into
>>> it.
>>>
>>> Thanks,
>>> John
>>>
>>>
>> This is how it looks when correlated on a graph, traffic vs. memory:
>> https://ibb.co/njpkqG
>>
>> Typical hfsc class + qdisc:
>> ### Client interface vlan1616
>> tc qdisc del dev vlan1616 root
>> tc qdisc add dev vlan1616 handle 1: root hfsc default 100
>> tc class add dev vlan1616 parent 1: classid 1:100 hfsc ls m2 200Mbit ul m2 200Mbit
>> tc qdisc add dev vlan1616 parent 1:100 handle 100: pfifo limit 128
>> ### End TM for client interface
>> tc qdisc del dev vlan1616 ingress
>> tc qdisc add dev vlan1616 handle ffff: ingress
>> tc filter add dev vlan1616 parent ffff: protocol ip prio 50 u32 match ip src 0.0.0.0/0 police rate 200Mbit burst 200M mtu 32k drop flowid 1:1
>>
>> The same setup is applied to about 450 vlan interfaces.
>>
>>
>> The good news is that, compared to 4.14.3, I see about 5% less CPU load on 4.15.0-rc2+.
>>
>> Once hfsc or tbf is lockless too, it will make a really huge difference in CPU load on x86 when using traffic shaping, so really good job, John.
>>
>>
>>
>>
>
> Yesterday I changed the kernel from
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
>
> to
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?h=v4.15-rc3
>
>
> And there is no memleak.
> So yes, it is probably the lockless qdisc patches.
>
It seems I was able to produce a similar memleak with qdisc patches
reverted and running TCP traffic overnight. I guess we can do a bisect
and track it down. Will try to get a "good" run tonight.
Thanks,
John
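
For anyone trying to tell a "good" run from a "bad" one while bisecting, a rough sketch like the one below might help. It samples unreclaimable slab memory before and after a traffic run. It assumes the slab_unrecl value quoted above corresponds to the SUnreclaim field in /proc/meminfo; the function names are made up for illustration and are not part of any existing tool.

```shell
#!/bin/sh
# Hypothetical helper: print the SUnreclaim value (in kB) from a
# meminfo-formatted file. Defaults to /proc/meminfo when no file is given.
sunreclaim_kb() {
    awk '$1 == "SUnreclaim:" { print $2 }' "${1:-/proc/meminfo}"
}

# Hypothetical helper: growth in kB between two samples
# (first argument: earlier sample, second argument: later sample).
slab_growth_kb() {
    echo $(( $2 - $1 ))
}

# Example use around a traffic run (commented out so that sourcing
# this file has no side effects):
#   before=$(sunreclaim_kb)
#   ... run traffic for a while ...
#   after=$(sunreclaim_kb)
#   echo "SUnreclaim grew by $(slab_growth_kb "$before" "$after") kB"
```

What counts as too much growth is a judgment call that depends on the workload, so the sketch only reports the difference and leaves the good/bad decision to whoever runs the bisect.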