Message-ID: <20190320152516.3gbmaj5xoyxkivyt@pathway.suse.cz>
Date: Wed, 20 Mar 2019 16:25:16 +0100
From: Petr Mladek <pmladek@...e.com>
To: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
John Ogness <john.ogness@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, Michal Hocko <mhocko@...nel.org>
Subject: Re: ratelimit API: was: [RFC PATCH] printk: Introduce "store now but
print later" prefix.

On Thu 2019-03-07 03:24:25, Tetsuo Handa wrote:
> On 2019/03/06 19:04, Petr Mladek wrote:
> > I did not mean serializing. I meant to avoid printing the warnings
> > at all until OOM killer finishes its job.
>
> But your ratelimit_reset() below requires serializing.
>
> >
> >
> >> Also, both nopage_rs in warn_alloc() and oom_rs in oom_kill_process() are not
> >> working well. This is because ___ratelimit() function assumes that operations
> >> to be ratelimited complete fast enough to be able to repeat many times within
> >> a second. If one operation to be ratelimited takes many seconds (or even
> >> minutes), ___ratelimit() becomes always true and can not ratelimit at all.
> >
> > The current ratelimiting is time driven. We might need an event
> > driven variant. It might even be done with the current
> > implementation if we add something like:
> >
> > void ratelimit_reset(struct ratelimit_state *rs)
> > {
> > 	unsigned long flags;
> >
> > 	raw_spin_lock_irqsave(&rs->lock, flags);
> >
> > 	rs->begin = jiffies;
> > 	rs->printed = 0;
> >
> > 	raw_spin_unlock_irqrestore(&rs->lock, flags);
> > }
> >
> > We could call this when some event "solved" the problem.
>
> This requires serialization among threads using "rs". I already
> proposed ratelimit_reset() for memcg's OOM problem at
> https://lkml.kernel.org/r/201810180246.w9I2koi3011358@www262.sakura.ne.jp
> but it was not accepted.

IMHO, the main problem was that the patch tried to work around
the weakness of the ratelimit API with custom code.

I believe that an improved/extended ratelimit API with
sane semantics would be more acceptable.

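To make the weakness concrete, here is a minimal userspace sketch of
a time-driven window check. It is only a model of the idea behind
___ratelimit(), not the kernel code: struct window, window_allows()
and count_allowed() are hypothetical names, time is a plain tick
counter supplied by the caller, and there is no locking.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a time-driven rate limit window. */
struct window {
	unsigned long interval;	/* window length in ticks */
	int burst;		/* calls allowed per window */
	unsigned long begin;	/* tick at which the current window started */
	int printed;		/* calls allowed so far in this window */
};

/* Return true when the caller may print at tick "now". */
static bool window_allows(struct window *w, unsigned long now)
{
	/* A new window starts as soon as the old one has expired. */
	if (now - w->begin > w->interval) {
		w->begin = now;
		w->printed = 0;
	}

	if (w->printed < w->burst) {
		w->printed++;
		return true;
	}
	return false;
}

/*
 * Count how many of "n" back-to-back operations pass the check
 * when each operation takes "op_ticks" ticks to complete.
 */
static int count_allowed(unsigned long interval, int burst,
			 unsigned long op_ticks, int n)
{
	struct window w = { .interval = interval, .burst = burst };
	unsigned long now = 0;
	int allowed = 0;

	for (int i = 0; i < n; i++) {
		if (window_allows(&w, now))
			allowed++;
		now += op_ticks;	/* the operation itself consumes time */
	}
	return allowed;
}
```

With interval = 5 and burst = 1, count_allowed(5, 1, 1, 10) lets only
2 of 10 fast operations through, while count_allowed(5, 1, 10, 10)
lets all 10 through: every slow operation lands in a fresh window,
so nothing is ever limited.
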
> > It means that it makes sense to enable the related
> > ratelimited messages again because they would describe
> > another problem.
>
> ___ratelimit() could also check number of not-yet-flushed
> printk() records (e.g. log_next_seq - console_seq <= $some_threshold).

The number is almost useless without more information, for example,
how fast the consoles are, how many lines will get filtered
by console_loglevel, whether the console_sem owner is sleeping,
and how many messages are being added by other CPUs.

I believe that we do not really need it. The ratelimit_reset()
user should know when the messages can get skipped because
they describe the same situation again and again.
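
A rough userspace sketch of how a caller might use such a reset
follows. It is an illustration under stated assumptions, not a
proposed patch: rl_model, rl_model_check() and rl_model_reset() are
hypothetical names, the field names merely follow struct
ratelimit_state, time is passed in explicitly, locking is left out,
and returning the number of suppressed calls from the reset is my
own addition.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct ratelimit_state. */
struct rl_model {
	unsigned long interval;	/* window length in ticks */
	int burst;		/* calls allowed per window */
	unsigned long begin;	/* start of the current window */
	int printed;		/* calls allowed in this window */
	int missed;		/* calls suppressed in this window */
};

/* Time-driven check: return true when the caller may print. */
static bool rl_model_check(struct rl_model *rs, unsigned long now)
{
	if (now - rs->begin > rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}

/*
 * Event-driven reset: the caller knows that the problem was "solved",
 * so later messages would describe a new problem and are allowed again.
 * Returns how many calls were suppressed (an assumption of this sketch).
 */
static int rl_model_reset(struct rl_model *rs, unsigned long now)
{
	int missed = rs->missed;

	rs->begin = now;
	rs->printed = 0;
	rs->missed = 0;
	return missed;
}
```

Setting interval to ULONG_MAX makes the model purely event driven:
after the first burst messages, nothing is printed until the caller
invokes rl_model_reset(), no matter how long each operation takes.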

Best Regards,
Petr