[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <229ebc3b-1c7e-474f-36f9-0fa603b889fb@gmail.com>
Date: Fri, 30 Aug 2019 18:15:22 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Qian Cai <cai@....pw>, Eric Dumazet <eric.dumazet@...il.com>,
davem@...emloft.net
Cc: netdev@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net/skbuff: silence warnings under memory pressure
On 8/30/19 5:25 PM, Qian Cai wrote:
> On Fri, 2019-08-30 at 17:11 +0200, Eric Dumazet wrote:
>>
>> On 8/30/19 4:57 PM, Qian Cai wrote:
>>> When running heavy memory pressure workloads, the system is throwing
>>> endless warnings below due to the allocation could fail from
>>> __build_skb(), and the volume of this call could be huge which may
>>> generate a lot of serial console output and cosumes all CPUs as
>>> warn_alloc() could be expensive by calling dump_stack() and then
>>> show_mem().
>>>
>>> Fix it by silencing the warning in this call site. Also, it seems
>>> unnecessary to even print a warning at all if the allocation failed in
>>> __build_skb(), as it may just retransmit the packet and retry.
>>>
>>
>> Same patches are showing up there and there from time to time.
>>
>> Why is this particular spot interesting, against all others not adding
>> __GFP_NOWARN ?
>>
>> Are we going to have hundred of patches adding __GFP_NOWARN at various points,
>> or should we get something generic to not flood the syslog in case of memory
>> pressure ?
>>
>
> From my testing which uses LTP oom* tests. There are only 3 places need to be
> patched. The other two are in IOMMU code for both Intel and AMD. The place is
> particular interesting because it could cause the system with floating serial
> console output for days without making progress in OOM. I suppose it ends up in
> a looping condition that warn_alloc() would end up generating more calls into
> __build_skb() via ksoftirqd.
>
Yes, but what about other tests done by other people ?
You do not really answer my last question, which was really the point I tried
to make.
If there is a risk of flooding the syslog, we should fix this generically
in mm layer, not adding hundred of __GFP_NOWARN all over the places.
Maybe just make __GFP_NOWARN the default, I dunno.
Powered by blists - more mailing lists