Message-ID: <5575b0cf-de59-4b4e-b339-c310f079bda7@redhat.com>
Date: Wed, 4 Jun 2025 16:58:25 +0200
From: David Hildenbrand <david@...hat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 Andrew Morton <akpm@...ux-foundation.org>,
 "Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka
 <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
 Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
 Jason Gunthorpe <jgg@...pe.ca>, John Hubbard <jhubbard@...dia.com>,
 Peter Xu <peterx@...hat.com>, Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v1] mm/gup: remove (VM_)BUG_ONs

On 04.06.25 16:48, Lorenzo Stoakes wrote:
> +Linus in case he has an opinion about BUG_ON() in general...
> 
> On Wed, Jun 04, 2025 at 04:05:44PM +0200, David Hildenbrand wrote:
>> Especially once we hit one of the assertions in
>> sanity_check_pinned_pages(), observing follow-up assertions failing
>> in other code can give good clues about what went wrong, so use
>> VM_WARN_ON_ONCE instead.
> 
> I guess the situation where you'd actually want a BUG_ON() is one where
> carrying on might cause further corruption so you just want things to stop.

Yes. For example, when stopping right away would avoid serious data corruption.

> 
> But usually we're already pretty screwed if the thing happened right? So
> it's rare if ever that this would be legit?
> 
> Linus's point of view is that we shouldn't use them _at all_ right? So
> maybe even this situation isn't one where we'd want to use one?

I think the grey zone is actual data corruption. But one has to have a 
pretty good reason to use a BUG_ON and not a WARN_ON_ONCE() + recovery.
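
To make "WARN_ON_ONCE() + recovery" concrete, here is a minimal userspace
sketch. It is not kernel code: the SKETCH_* macros and sketch_copy() are
hypothetical stand-ins invented for illustration, and the statement
expression assumes GCC or Clang. The BUG_ON-style check stops the program,
while the warn-once check reports the first hit and lets the caller back out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for BUG_ON(): execution stops when cond is true. */
#define SKETCH_BUG_ON(cond) assert(!(cond))

/*
 * Hypothetical stand-in for WARN_ON_ONCE(): print a message the first time
 * this call site sees a true condition, and always evaluate to the condition
 * so that the caller can recover.
 */
#define SKETCH_WARN_ON_ONCE(cond) ({					\
	static bool warned_;						\
	bool ret_ = !!(cond);						\
									\
	if (ret_ && !warned_) {						\
		warned_ = true;						\
		fprintf(stderr, "WARNING: %s at %s:%d\n",		\
			#cond, __FILE__, __LINE__);			\
	}								\
	ret_;								\
})

/* Hypothetical helper: copy len bytes into a buffer of cap bytes. */
static int sketch_copy(char *dst, const char *src, int len, int cap)
{
	/* "Should never happen": warn once, then refuse and keep running. */
	if (SKETCH_WARN_ON_ONCE(len < 0 || len > cap))
		return -1;

	for (int i = 0; i < len; i++)
		dst[i] = src[i];
	return len;
}
```

Calling sketch_copy() repeatedly with an oversized len returns -1 every time
but only prints the warning on the first call, which mirrors why a warn-once
check keeps the log readable while the caller still gets to bail out.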

> 
>>
>> While at it, let's just convert all VM_BUG_ON to VM_WARN_ON_ONCE as
>> well. Add one comment for the pfn_valid() check.
> 
> Yeah VM_BUG_ON() is just _weird_. Maybe we should get rid of all of them
> full stop?

That's my thinking as well.

> 
>>
>> We have to introduce VM_WARN_ON_ONCE_VMA() to make that fly.
> 
> I checked the implementation vs. the other VM_WARN_ON_ONCE_*()'s and it
> looks good.
> 
> I wonder if we can find a way to not duplicate this code... but one for a
> follow up I think :>)
> 
>>
>> Drop the BUG_ON after mmap_read_lock_killable(), if that ever returns
>> something > 0 we're in bigger trouble. Convert the other BUG_ON's into
>> VM_WARN_ON_ONCE as well, they are in a similar domain "should never
>> happen", but more reasonable to check for during early testing.
>>
>> Cc: Andrew Morton <akpm@...ux-foundation.org>
>> Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
>> Cc: "Liam R. Howlett" <Liam.Howlett@...cle.com>
>> Cc: Vlastimil Babka <vbabka@...e.cz>
>> Cc: Mike Rapoport <rppt@...nel.org>
>> Cc: Suren Baghdasaryan <surenb@...gle.com>
>> Cc: Michal Hocko <mhocko@...e.com>
>> Cc: Jason Gunthorpe <jgg@...pe.ca>
>> Cc: John Hubbard <jhubbard@...dia.com>
>> Cc: Peter Xu <peterx@...hat.com>
>> Signed-off-by: David Hildenbrand <david@...hat.com>
> 
> LGTM so,
> 
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> 
> 

Thanks!

> One nit below.
> 
>> ---
>>
>> Wanted to do this for a long time, but my todo list keeps growing ...
> 
> Sounds familiar :) Merge window a chance to do some of these things...
> 
>>
>> Based on mm/mm-unstable
>>
>> ---
>>   include/linux/mmdebug.h | 12 ++++++++++++
>>   mm/gup.c                | 41 +++++++++++++++++++----------------------
>>   2 files changed, 31 insertions(+), 22 deletions(-)
>>
>> diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
>> index a0a3894900ed4..14a45979cccc9 100644
>> --- a/include/linux/mmdebug.h
>> +++ b/include/linux/mmdebug.h
>> @@ -89,6 +89,17 @@ void vma_iter_dump_tree(const struct vma_iterator *vmi);
>>   	}								\
>>   	unlikely(__ret_warn_once);					\
>>   })
>> +#define VM_WARN_ON_ONCE_VMA(cond, vma)		({			\
>> +	static bool __section(".data..once") __warned;			\
>> +	int __ret_warn_once = !!(cond);					\
>> +									\
>> +	if (unlikely(__ret_warn_once && !__warned)) {			\
>> +		dump_vma(vma);						\
>> +		__warned = true;					\
>> +		WARN_ON(1);						\
>> +	}								\
>> +	unlikely(__ret_warn_once);					\
>> +})
> 
> An aside, I wonder if we could somehow make this generic for various
> WARN_ON_ONCE()'s?

Yeah, probably. It might get ... ugly, though :)
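
One possible shape for such a generic helper, as a userspace sketch only.
Nothing here exists in mmdebug.h: SKETCH_WARN_ON_ONCE_DUMP(), struct
sketch_vma and sketch_dump_vma() are all hypothetical names, and the
statement expression assumes GCC or Clang. The idea is to pass the dump
function as a macro argument so that each per-object variant becomes a
one-line wrapper.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical generic warn-once helper: on the first true condition at a
 * call site, dump the object via dump_fn, then emit the warning. Always
 * evaluates to the condition.
 */
#define SKETCH_WARN_ON_ONCE_DUMP(cond, dump_fn, obj) ({			\
	static bool warned_;						\
	bool ret_ = !!(cond);						\
									\
	if (ret_ && !warned_) {						\
		dump_fn(obj);						\
		warned_ = true;						\
		fprintf(stderr, "WARNING: %s\n", #cond);		\
	}								\
	ret_;								\
})

/* Hypothetical object type standing in for a VMA. */
struct sketch_vma {
	unsigned long start;
	unsigned long end;
};

/* Counts dumps so the once-only behaviour can be observed. */
static int sketch_dump_count;

static void sketch_dump_vma(const struct sketch_vma *vma)
{
	sketch_dump_count++;
	fprintf(stderr, "vma %lx-%lx\n", vma->start, vma->end);
}

/* A per-object variant is then a thin wrapper around the generic helper. */
#define SKETCH_WARN_ON_ONCE_VMA(cond, vma) \
	SKETCH_WARN_ON_ONCE_DUMP(cond, sketch_dump_vma, vma)

/* Single call site: returns true when the range is empty or inverted. */
static bool sketch_check_vma(const struct sketch_vma *vma)
{
	return SKETCH_WARN_ON_ONCE_VMA(vma->start >= vma->end, vma);
}
```

The ugly part would likely be the variants that print extra context or take
differently typed objects, which this sketch does not attempt to cover.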

> 
>>   #define VM_WARN_ON_VMG(cond, vmg)		({			\
>>   	int __ret_warn = !!(cond);					\
>>   									\
>> @@ -115,6 +126,7 @@ void vma_iter_dump_tree(const struct vma_iterator *vmi);
>>   #define VM_WARN_ON_FOLIO(cond, folio)  BUILD_BUG_ON_INVALID(cond)
>>   #define VM_WARN_ON_ONCE_FOLIO(cond, folio)  BUILD_BUG_ON_INVALID(cond)
>>   #define VM_WARN_ON_ONCE_MM(cond, mm)  BUILD_BUG_ON_INVALID(cond)
>> +#define VM_WARN_ON_ONCE_VMA(cond, vma)  BUILD_BUG_ON_INVALID(cond)
>>   #define VM_WARN_ON_VMG(cond, vmg)  BUILD_BUG_ON_INVALID(cond)
>>   #define VM_WARN_ONCE(cond, format...) BUILD_BUG_ON_INVALID(cond)
>>   #define VM_WARN(cond, format...) BUILD_BUG_ON_INVALID(cond)
>> diff --git a/mm/gup.c b/mm/gup.c
>> index e065a49842a87..3c3931fcdd820 100644
>> --- a/mm/gup.c
>> +++ b/mm/gup.c
>> @@ -64,11 +64,11 @@ static inline void sanity_check_pinned_pages(struct page **pages,
>>   		    !folio_test_anon(folio))
>>   			continue;
>>   		if (!folio_test_large(folio) || folio_test_hugetlb(folio))
>> -			VM_BUG_ON_PAGE(!PageAnonExclusive(&folio->page), page);
>> +			VM_WARN_ON_ONCE_PAGE(!PageAnonExclusive(&folio->page), page);
>>   		else
>>   			/* Either a PTE-mapped or a PMD-mapped THP. */
>> -			VM_BUG_ON_PAGE(!PageAnonExclusive(&folio->page) &&
>> -				       !PageAnonExclusive(page), page);
>> +			VM_WARN_ON_ONCE_PAGE(!PageAnonExclusive(&folio->page) &&
>> +					     !PageAnonExclusive(page), page);
> 
> Nit but wouldn't VM_WARN_ON_ONCE_FOLIO() work better here?

No, we want the actual problematic page here, as that can give us clues
about what is going wrong.

For the small-folio case above we could use it, though.

-- 
Cheers,

David / dhildenb

