[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190904120707.GU3838@dhcp22.suse.cz>
Date: Wed, 4 Sep 2019 14:07:07 +0200
From: Michal Hocko <mhocko@...nel.org>
To: Qian Cai <cai@....pw>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
Eric Dumazet <eric.dumazet@...il.com>, davem@...emloft.net,
netdev@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Petr Mladek <pmladek@...e.com>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH] net/skbuff: silence warnings under memory pressure
On Wed 04-09-19 07:59:17, Qian Cai wrote:
> On Wed, 2019-09-04 at 10:25 +0200, Michal Hocko wrote:
> > On Wed 04-09-19 16:00:42, Sergey Senozhatsky wrote:
> > > On (09/04/19 15:41), Sergey Senozhatsky wrote:
> > > > But the thing is different in case of dump_stack() + show_mem() +
> > > > some other output. Because now we ratelimit not a single printk() line,
> > > > but hundreds of them. The ratelimit becomes - 10 * $$$ lines in 5 seconds
> > > > (IOW, now we talk about thousands of lines).
> > >
> > > And on devices with slow serial consoles this can be somewhat close to
> > > "no ratelimit". *Suppose* that warn_alloc() adds 700 lines each time.
> > > Within 5 seconds we can call warn_alloc() 10 times, which will add 7000
> > > lines to the logbuf. If printk() can evict only 6000 lines in 5 seconds
> > > then we have a growing number of pending logbuf messages.
> >
> > Yes, ratelimit is problematic when the ratelimited operation is slow. I
> > guess that is a well known problem and we would need to rework both the
> > api and the implementation to make it work in those cases as well.
> > Essentially we need to make the ratelimit act as a gatekeeper to an
> > operation section - something like a critical section except you can
> > tolerate more code executions but not too many. So effectively
> >
> > start_throttle(rate, number);
> > /* here goes your operation */
> > end_throttle();
> >
> > one operation is not considered done until the whole section ends.
> > Or something along those lines.
> >
> > In this particular case we can increase the rate limit parameters of
> > course but I think that longterm we need a better api.
>
> The problem is when a system is under heavy memory pressure, everything is
> becoming slower, so I don't know how to come up with a sane default for rate
> limit parameters as a generic solution that would work for every machine out
> there. Sure, it is possible to set a limit as low as possible that would work
> for the majority of systems apart from people may complain that they are now
> missing important warnings, but using __GFP_NOWARN in this code would work for
> all systems. You could even argument there is even a separate benefit that it
> could reduce the noise-level overall from those build_skb() allocation failures
> as it has a fall-back mechanism anyway.
As Vlastimil already pointed out, __GFP_NOWARN would hide that reserves
might be configured too low.
--
Michal Hocko
SUSE Labs
Powered by blists - more mailing lists