linux-kernel - Re: [PATCH v1] mm/gup: remove (VM_)BUG

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <8ce6c104-d2fb-4fed-a108-224a6113c227@nvidia.com>
Date: Sat, 7 Jun 2025 11:00:50 -0700
From: John Hubbard <jhubbard@...dia.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 Jason Gunthorpe <jgg@...pe.ca>
Cc: David Hildenbrand <david@...hat.com>, Michal Hocko <mhocko@...e.com>,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 Andrew Morton <akpm@...ux-foundation.org>,
 "Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka
 <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
 Suren Baghdasaryan <surenb@...gle.com>, Peter Xu <peterx@...hat.com>
Subject: Re: [PATCH v1] mm/gup: remove (VM_)BUG_ONs

On 6/7/25 6:53 AM, Lorenzo Stoakes wrote:
> On Sat, Jun 07, 2025 at 10:42:14AM -0300, Jason Gunthorpe wrote:
>> On Fri, Jun 06, 2025 at 08:03:15PM +0100, Lorenzo Stoakes wrote:
>>> On Fri, Jun 06, 2025 at 07:46:52PM +0100, Lorenzo Stoakes wrote:
>>>> On Fri, Jun 06, 2025 at 03:42:12PM -0300, Jason Gunthorpe wrote:
>>>>> On Fri, Jun 06, 2025 at 08:23:25PM +0200, David Hildenbrand wrote:
...
>>> And I guess we'd not want a new interface for this like WARN_ON_ONCE_STORED()
>>> because that'd be... weird and how would anyone think to use that and nearly all
>>> cases wouldn't.
>>
>> No! Delete WARN_ON_ONCE and say the new global mechanism is good
>> enough to capture the first WARN_ON, everyone always uses it always
>> and then nobody needs to think about this anymore when writing new
>> code.
Interesting. This would cause WARN_ON*() to act similarly to lockdep, in
that the first instance is the only one that works.

That works, but if-and-only-if the system is kept completely WARN-free.
However, I have a mitigation for that, below.

> 
> Well that is simpler :)
> 
> I have encountered situations where I've had more than one and needed 2nd+
> but it is rare as you say.
> 
> My late night incoherent babbling yesterday was perhaps because I
> misunderstood David/John as to what they encountered in the past... maybe
> they can clarify...

I've debugged lots of production systems, often these were large HPC 
clusters and supercomputers. I've seen:

a) Long up-times, with (of course!) relatively small dmesg buffer sizes,
so that early logs are long gone. This means that WARN_ON_ONCE() is
quite often gone (overwritten). This is common.

The worst part is that if you go to reproduce a problem, you don't
see the next warning in the logs!! This is devastating, especially if
the site makes it hard to ask for a system reboot. (If you have
~20,000 nodes in the cluster, a reboot is not a small affair.)

b) WARN_ON() loops that fill up the dmesg buffer immediately, which
of course also overwrites anything leading up to the first WARN_ON.

However, (b) is rare on production kernels--it's more commonly
self-inflicted, if I make a mistake and write WARN_ON() when I should
have written WARN_ON_ONCE(), in a custom build to explore a problem.

c) Other printk-based noise in a loop, occasionally this also wraps
the dmesg buffer. Especially if a customer has done a bit of their
own "special" logging code.

> 
> I do find myself grepping dmesg to find the first warning and it's
> _annoying_ to do so, so this would be handy.
> 
> But I'm not sure it'd justify getting rid of WARN_ON_ONCE() when you are in
> a loop or something and now your dmesg is going to go to hell, still useful
> not to do that, esp. if you know there's no value to further warnings

In all cases, the ideal outcome is a dmesg buffer that includes (let's
fantasize for a moment):

1) the earliest dmesg output (showing any oddness with system
configuration, and what hardware was brought online, etc), plus

2) the first problem that is logged, plus

3) ...sometimes two or even three things happen in order to get to the
problem, so (again, fantasizing!) really the first *three* warnings
would be good to have.

So this is just a 3-element array of global warnings--not another
circular buffer, but a little larger than a TAINT or lockdep type
of capture.

thanks,
-- 
John Hubbard