Message-ID: <CABk29NsPH-xpTNSB5CcLOHZ-TPVgFa3Dj0O=VO_OL9v+BGMh0Q@mail.gmail.com>
Date: Tue, 7 Jan 2025 12:45:40 -0800
From: Josh Don <joshdon@...gle.com>
To: David Rientjes <rientjes@...gle.com>
Cc: Madadi Vineeth Reddy <vineethr@...ux.ibm.com>, Ingo Molnar <mingo@...hat.com>, 
	Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>, 
	Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>, 
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Valentin Schneider <vschneid@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [patch 2/2] sched/debug: Remove need_resched ratelimiting for warnings

On Tue, Jan 7, 2025 at 12:15 PM David Rientjes <rientjes@...gle.com> wrote:
>
> On Tue, 7 Jan 2025, Madadi Vineeth Reddy wrote:
>
> > Any idea why it was initially kept to one warning per hour?
> >
>
> Adding Josh Don who may have insight into this historically.

No idea on the hour default, unfortunately. Almost certainly arbitrary.

> > The possible reasons that come to mind are to prevent excessive logging under
> > high CPU contention, as well as to ensure that a warning logged once an hour
> > indicates the issue is not caused by a short workload spike. Additionally,
> > this rate limit might help avoid impacting system performance due to excessive
> > logging.
> >
> > However, if the default value of latency_warn_once is changed to disable it, it
> > may be acceptable to bypass the rate limit, as it would indicate a preference
> > for logging over performance.
> >
>
> Right, I think this should be entirely up to what the admin configures in
> debugfs.  If they elect to disable latency_warn_once, we'll simply emit
> the information as often as they specify in latency_warn_ms and not add
> our own ratelimiting on top.  If they have a preference for lots of
> logging, so be it, let's not hide that data.

Your change doesn't reset rq->last_seen_need_resched_ns, so now
without the ratelimit I think we'll get a dump every single tick until
we eventually reschedule.

Another potential benefit of the ratelimit is that if something
wedges multiple CPUs concurrently, we don't spam the log (when
latency_warn_once is disabled). That's probably an unlikely
occurrence, though.

I think if you modify the patch to reset last_seen_need_resched_ns
that'll give the behavior you're after.
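
To illustrate the difference, here is a rough user-space model of the
per-tick check. It is only a sketch, not the kernel code: struct fake_rq,
check_tick(), count_warnings() and the reset_after_warn flag are
hypothetical names, the tick is assumed to be a fixed number of
milliseconds, and "reset" is assumed to mean restarting the timestamp at
the current time once a warning fires.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL

/* Hypothetical stand-in for the one rq field discussed above. */
struct fake_rq {
	uint64_t last_seen_need_resched_ns; /* 0 means "not recorded yet" */
};

/*
 * Model of one scheduler tick while need_resched stays set.
 * Returns the observed latency in ns if a warning would be emitted
 * on this tick, 0 otherwise.
 */
static uint64_t check_tick(struct fake_rq *rq, uint64_t now_ns,
			   uint64_t warn_ms, bool reset_after_warn)
{
	uint64_t latency;

	/* First tick that sees need_resched: just record the time. */
	if (!rq->last_seen_need_resched_ns) {
		rq->last_seen_need_resched_ns = now_ns;
		return 0;
	}

	latency = now_ns - rq->last_seen_need_resched_ns;
	if (latency <= warn_ms * NSEC_PER_MSEC)
		return 0;

	/*
	 * Assumed form of the suggested fix: restart the measurement
	 * window so the next warning needs another warn_ms to elapse.
	 */
	if (reset_after_warn)
		rq->last_seen_need_resched_ns = now_ns;

	return latency;
}

/* Count warnings over `ticks` ticks with need_resched stuck on. */
unsigned int count_warnings(unsigned int ticks, uint64_t tick_ms,
			    uint64_t warn_ms, bool reset_after_warn)
{
	struct fake_rq rq = { 0 };
	unsigned int warnings = 0;

	for (unsigned int t = 1; t <= ticks; t++) {
		uint64_t now_ns = (uint64_t)t * tick_ms * NSEC_PER_MSEC;

		if (check_tick(&rq, now_ns, warn_ms, reset_after_warn))
			warnings++;
	}
	return warnings;
}
```

In this model, with a 1 ms tick, a 100 ms threshold and 1000 stuck ticks,
count_warnings(1000, 1, 100, false) reports 899 warnings (one per tick
once the threshold is crossed), while count_warnings(1000, 1, 100, true)
reports 9 (roughly one per threshold interval).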

Best,
Josh
