Message-ID: <ee0723b0-1a4e-eef3-8833-c2eb034e5d08@suse.cz>
Date: Mon, 2 Sep 2019 16:24:45 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Qian Cai <cai@....pw>, Eric Dumazet <eric.dumazet@...il.com>,
davem@...emloft.net
Cc: netdev@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Michal Hocko <mhocko@...nel.org>
Subject: Re: [PATCH] net/skbuff: silence warnings under memory pressure
On 8/30/19 5:25 PM, Qian Cai wrote:
> On Fri, 2019-08-30 at 17:11 +0200, Eric Dumazet wrote:
>>
>> On 8/30/19 4:57 PM, Qian Cai wrote:
>>> When running heavy memory pressure workloads, the system throws the
>>> endless warnings below because the allocation in __build_skb() can
>>> fail. The volume of these calls can be huge, which may generate a lot
>>> of serial console output and consume all CPUs, as warn_alloc() can be
>>> expensive: it calls dump_stack() and then show_mem().
>>>
>>> Fix it by silencing the warning in this call site. Also, it seems
>>> unnecessary to even print a warning at all if the allocation failed in
>>> __build_skb(), as it may just retransmit the packet and retry.
>>>
Well, __GFP_NOWARN would save me from explaining this warning to users
many times. OTOH usually it's an indication that min_free_kbytes should
be raised to better cope with network traffic.
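To make the effect of the flag concrete, here is a minimal userspace sketch, not kernel code. DEMO_GFP_NOWARN, demo_alloc() and demo_warnings are hypothetical stand-ins for __GFP_NOWARN, the page allocator and warn_alloc(), and the failure is simulated by a parameter. The only point is that a caller with a fallback path can opt out of the failure report:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flag bit; stands in for __GFP_NOWARN in this sketch only. */
#define DEMO_GFP_NOWARN 0x1u

/* Counts how many failure reports were emitted. */
static unsigned int demo_warnings;

/*
 * Hypothetical allocator wrapper. On failure it reports the problem
 * unless the caller passed DEMO_GFP_NOWARN, in which case it fails
 * silently and leaves recovery to the caller's fallback path.
 */
static void *demo_alloc(size_t size, unsigned int flags, bool simulate_failure)
{
    void *p;

    assert(size > 0);
    p = simulate_failure ? NULL : malloc(size);
    if (!p && !(flags & DEMO_GFP_NOWARN)) {
        demo_warnings++;
        fprintf(stderr, "demo: allocation of %zu bytes failed\n", size);
    }
    return p;
}
```

A caller that can simply drop the packet and wait for a retransmit would pass DEMO_GFP_NOWARN and handle the NULL return itself.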
>>
>> The same patches keep showing up here and there from time to time.
>>
>> Why is this particular spot interesting, compared with all the others
>> that do not add __GFP_NOWARN?
This one is interesting in that it's a GFP_ATOMIC allocation triggered by
incoming packets, and it has a fallback mechanism. I don't recall any
other ones that are so notorious.
>> Are we going to have hundreds of patches adding __GFP_NOWARN at various points,
>> or should we get something generic to not flood the syslog in case of memory
>> pressure?
>>
>
> From my testing, which uses the LTP oom* tests, there are only 3 places that
> need to be patched. The other two are in IOMMU code for both Intel and AMD.
> This place is particularly interesting because it could leave the system
> flooding the serial console with output for days without making progress in
> OOM. I suppose it ends up in a loop where warn_alloc() generates more calls
> into __build_skb() via ksoftirqd.
Regardless of this particular allocation, if the reporting itself makes
the conditions so much worse, then at least some kind of general
ratelimit would make sense indeed.
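As a rough illustration of what such a general ratelimit could look like, here is a minimal userspace sketch of an interval/burst limiter. It is not the kernel's implementation: struct demo_ratelimit and the demo_* functions are hypothetical names, and time is passed in as an abstract tick count supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limiter state: at most `burst` reports per `interval` ticks. */
struct demo_ratelimit {
    uint64_t interval;    /* window length, in caller-defined ticks */
    unsigned int burst;   /* reports allowed per window */
    uint64_t begin;       /* tick at which the current window started */
    unsigned int printed; /* reports emitted in the current window */
    unsigned int missed;  /* reports suppressed so far */
};

static void demo_ratelimit_init(struct demo_ratelimit *rl,
                                uint64_t interval, unsigned int burst)
{
    assert(rl != NULL && interval > 0 && burst > 0);
    rl->interval = interval;
    rl->burst = burst;
    rl->begin = 0;
    rl->printed = 0;
    rl->missed = 0;
}

/* Returns true if the caller may emit a report at tick `now`. */
static bool demo_ratelimit_ok(struct demo_ratelimit *rl, uint64_t now)
{
    /* Start a fresh window once the current one has expired. */
    if (now - rl->begin >= rl->interval) {
        rl->begin = now;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    /* Over budget for this window: suppress and count the miss. */
    rl->missed++;
    return false;
}
```

With interval 10 and burst 2, reports at ticks 0 and 1 are allowed, a report at tick 2 is suppressed, and a report at tick 10 is allowed again because a new window starts. The expensive part of the report (the stack dump and memory summary) would only run when the check returns true.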