Message-ID: <df6f3200-5989-c4b8-65ce-85ba82f4196f@nvidia.com>
Date:   Sat, 13 Aug 2022 18:10:20 -0700
From:   John Hubbard <jhubbard@...dia.com>
To:     Jason Gunthorpe <jgg@...dia.com>,
        Matthew Wilcox <willy@...radead.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Ingo Molnar <mingo@...nel.org>,
        David Hildenbrand <david@...hat.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        stable@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        Peter Xu <peterx@...hat.com>, Hugh Dickins <hughd@...gle.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Mel Gorman <mgorman@...e.de>
Subject: Re: [PATCH] sched/all: Change BUG_ON() instances to WARN_ON()

On 8/11/22 16:22, Jason Gunthorpe wrote:
> On Thu, Aug 11, 2022 at 10:28:27PM +0100, Matthew Wilcox wrote:
>> On Thu, Aug 11, 2022 at 01:43:09PM -0700, Linus Torvalds wrote:
>>> May I suggest going one step further, and making these WARN_ON_ONCE() instead.
>>>
>>> From personal experience, once some scheduler bug (or task struct
>>> corruption) happens, it often *keeps* happening, and the logs just
>>> fill up with more and more data, to the point where you lose sight of
>>> the original report (and the machine can even get unusable just from
>>> the logging).
>>
>> I've been thinking about magically turning all the WARN_ON_ONCE() into
>> (effectively) WARN_ON_RATELIMIT().  I had some patches in that direction
>> a while ago but never got round to tidying them up for submission.

If you do that, I'd like to suggest that you avoid using magic here, but
instead just rename at the call sites.

Because:

First and foremost, something named WARN_ON_ONCE() clearly has a solemn
responsibility to warn exactly "once times"! :)

Second, it's not yet clear (or is it?) that WARN_ON_ONCE() is always
worse than rate limiting. It's a trade-off, rather than a clear win for
either case, in my experience. The single _ONCE report can be lost if
the kernel log wraps, while the _RATELIMIT variant, on the other hand,
may still log excessively.

And finally, if it *is* agreed on here that WARN_ON_RATELIMIT() is
always better than WARN_ON_ONCE(), then there is still no harm in
spending a patch or two (coccinelle...) to rename WARN_ON_ONCE() -->
WARN_ON_RATELIMIT(), so that we end up with accurate names.
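
To make the trade-off above concrete, here is a minimal userspace sketch
of the two behaviors. It is not kernel code: warn_once(),
warn_ratelimited(), the state structs, and the tick-based clock are
hypothetical stand-ins invented for illustration, and the interval and
burst values are arbitrary assumptions.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins, NOT the kernel's macros. */

struct once_state {
	bool fired;		/* set after the first report */
};

/* Report at most once per state object; returns true if it printed. */
static bool warn_once(struct once_state *s, bool cond, const char *msg)
{
	if (!cond || s->fired)
		return false;
	s->fired = true;
	fprintf(stderr, "WARNING (once): %s\n", msg);
	return true;
}

struct ratelimit_state {
	long interval;		/* window length, in abstract ticks */
	int burst;		/* max reports allowed per window */
	long window_start;	/* tick at which the current window began */
	int printed;		/* reports emitted in the current window */
};

/* Report at most 'burst' times per 'interval' ticks. */
static bool warn_ratelimited(struct ratelimit_state *rs, long now,
			     bool cond, const char *msg)
{
	if (!cond)
		return false;
	if (now - rs->window_start >= rs->interval) {
		/* The old window expired: open a new one. */
		rs->window_start = now;
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	fprintf(stderr, "WARNING (ratelimited, tick %ld): %s\n", now, msg);
	return true;
}

/* Simulate a bug that triggers on every tick; count the reports. */
static int count_once(int events)
{
	struct once_state s = { .fired = false };
	int printed = 0;

	for (int i = 0; i < events; i++)
		printed += warn_once(&s, true, "scheduler invariant violated");
	return printed;
}

static int count_ratelimited(int events, long interval, int burst)
{
	struct ratelimit_state rs = {
		.interval = interval,
		.burst = burst,
		.window_start = 0,
		.printed = 0,
	};
	int printed = 0;

	for (int i = 0; i < events; i++)
		printed += warn_ratelimited(&rs, i, true,
					    "scheduler invariant violated");
	return printed;
}
```

With a bug that fires on each of 100 ticks, count_once(100) reports a
single warning, which is the one that can scroll out of a wrapping log.
count_ratelimited(100, 10, 2) reports two per 10-tick window, so the
volume keeps growing for as long as the bug keeps firing.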

> 
> I often wonder if we have a justification for WARN_ON to even exist, I
> see a lot of pressure to make things into WARN_ON_ONCE based on the
> logic that spamming makes it useless..

Agreed. WARN_ON_ONCE() or WARN_ON_RATELIMIT(), take your pick. But not
WARN_ON_EVERY_TIME()--that usually causes serious problems in the
logs.


thanks,
-- 
John Hubbard
NVIDIA
