Message-ID: <5575b0cf-de59-4b4e-b339-c310f079bda7@redhat.com>
Date: Wed, 4 Jun 2025 16:58:25 +0200
From: David Hildenbrand <david@...hat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Andrew Morton <akpm@...ux-foundation.org>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka
<vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Jason Gunthorpe <jgg@...pe.ca>, John Hubbard <jhubbard@...dia.com>,
Peter Xu <peterx@...hat.com>, Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v1] mm/gup: remove (VM_)BUG_ONs
On 04.06.25 16:48, Lorenzo Stoakes wrote:
> +Linus in case he has an opinion about BUG_ON() in general...
>
> On Wed, Jun 04, 2025 at 04:05:44PM +0200, David Hildenbrand wrote:
>> Especially once we hit one of the assertions in
>> sanity_check_pinned_pages(), observing follow-up assertions failing
>> in other code can give good clues about what went wrong, so use
>> VM_WARN_ON_ONCE instead.
>
> I guess the situation where you'd actually want a BUG_ON() is one where
> carrying on might cause further corruption so you just want things to stop.
Yes. For example, when stopping would avoid serious data corruption.
>
> But usually we're already pretty screwed if the thing happened right? So
> it's rare if ever that this would be legit?
>
> Linus's point of view is that we shouldn't use them _at all_ right? So
> maybe even this situation isn't one where we'd want to use one?
I think the grey zone is actual data corruption. But one has to have a
pretty good reason to use a BUG_ON and not a WARN_ON_ONCE() + recovery.
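
To illustrate the "WARN_ON_ONCE() + recovery" shape (not part of the patch), here is a minimal user-space sketch. MY_WARN_ON_ONCE() is a stand-in written for this example, not the kernel macro, and copy_len_checked() is a hypothetical helper. It relies on GNU statement expressions, so it assumes gcc or clang.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * User-space stand-in for the kernel's WARN_ON_ONCE(): prints a warning
 * the first time the condition is true at this call site and always
 * evaluates to the truth value of the condition.
 */
#define MY_WARN_ON_ONCE(cond) ({					\
	static bool once_warned;					\
	bool once_ret = !!(cond);					\
	if (once_ret && !once_warned) {					\
		once_warned = true;					\
		fprintf(stderr, "WARNING: %s:%d: %s\n",			\
			__FILE__, __LINE__, #cond);			\
	}								\
	once_ret;							\
})

/*
 * Hypothetical helper: returns len on success, -1 if the
 * "should never happen" condition is hit.
 */
static int copy_len_checked(int len, int max)
{
	/* Instead of BUG_ON(len > max): warn once, then back out. */
	if (MY_WARN_ON_ONCE(len > max))
		return -1;
	return len;
}
```

The caller gets an error it can propagate, and the warning fires only once even if the condition keeps triggering.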
>
>>
>> While at it, let's just convert all VM_BUG_ON to VM_WARN_ON_ONCE as
>> well. Add one comment for the pfn_valid() check.
>
> Yeah VM_BUG_ON() is just _weird_. Maybe we should get rid of all of them
> full stop?
That's my thinking as well.
>
>>
>> We have to introduce VM_WARN_ON_ONCE_VMA() to make that fly.
>
> I checked the implementation vs. the other VM_WARN_ON_ONCE_*()'s and it
> looks good.
>
> I wonder if we can find a way to not duplicate this code... but one for a
> follow up I think :>)
>
>>
>> Drop the BUG_ON after mmap_read_lock_killable(), if that ever returns
>> something > 0 we're in bigger trouble. Convert the other BUG_ON's into
>> VM_WARN_ON_ONCE as well, they are in a similar domain "should never
>> happen", but more reasonable to check for during early testing.
>>
>> Cc: Andrew Morton <akpm@...ux-foundation.org>
>> Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
>> Cc: "Liam R. Howlett" <Liam.Howlett@...cle.com>
>> Cc: Vlastimil Babka <vbabka@...e.cz>
>> Cc: Mike Rapoport <rppt@...nel.org>
>> Cc: Suren Baghdasaryan <surenb@...gle.com>
>> Cc: Michal Hocko <mhocko@...e.com>
>> Cc: Jason Gunthorpe <jgg@...pe.ca>
>> Cc: John Hubbard <jhubbard@...dia.com>
>> Cc: Peter Xu <peterx@...hat.com>
>> Signed-off-by: David Hildenbrand <david@...hat.com>
>
> LGTM so,
>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
>
>
Thanks!
> One nit below.
>
>> ---
>>
>> Wanted to do this for a long time, but my todo list keeps growing ...
>
> Sounds familiar :) Merge window a chance to do some of these things...
>
>>
>> Based on mm/mm-unstable
>>
>> ---
>> include/linux/mmdebug.h | 12 ++++++++++++
>> mm/gup.c | 41 +++++++++++++++++++----------------------
>> 2 files changed, 31 insertions(+), 22 deletions(-)
>>
>> diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
>> index a0a3894900ed4..14a45979cccc9 100644
>> --- a/include/linux/mmdebug.h
>> +++ b/include/linux/mmdebug.h
>> @@ -89,6 +89,17 @@ void vma_iter_dump_tree(const struct vma_iterator *vmi);
>> } \
>> unlikely(__ret_warn_once); \
>> })
>> +#define VM_WARN_ON_ONCE_VMA(cond, vma) ({ \
>> + static bool __section(".data..once") __warned; \
>> + int __ret_warn_once = !!(cond); \
>> + \
>> + if (unlikely(__ret_warn_once && !__warned)) { \
>> + dump_vma(vma); \
>> + __warned = true; \
>> + WARN_ON(1); \
>> + } \
>> + unlikely(__ret_warn_once); \
>> +})
>
> An aside, I wonder if we could somehow make this generic for various
> WARN_ON_ONCE()'s?
Yeah, probably. Maybe it will get .... ugly :)
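
Purely as a thought experiment (nothing like this is in the patch), a generic variant could take the dump call as a macro argument. The sketch below is user-space C with made-up names: WARN_ON_ONCE_DUMP(), struct fake_vma and dump_fake_vma() are all assumptions for illustration, and it again relies on GNU statement expressions.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical generic form: dump_stmt is any expression that prints
 * context for the object being checked. It runs at most once per
 * call site, right before the warning itself.
 */
#define WARN_ON_ONCE_DUMP(cond, dump_stmt) ({				\
	static bool once_warned;					\
	bool once_ret = !!(cond);					\
	if (once_ret && !once_warned) {					\
		dump_stmt;						\
		once_warned = true;					\
		fprintf(stderr, "WARNING at %s:%d\n",			\
			__FILE__, __LINE__);				\
	}								\
	once_ret;							\
})

/* Made-up stand-in for a VMA; only here to have something to dump. */
struct fake_vma {
	unsigned long start, end;
};

static int dump_count;	/* counts how often the dump helper ran */

static void dump_fake_vma(const struct fake_vma *vma)
{
	dump_count++;
	fprintf(stderr, "vma %lx-%lx\n", vma->start, vma->end);
}

/* A per-type wrapper then collapses to a one-liner. */
#define WARN_ON_ONCE_FAKE_VMA(cond, vma) \
	WARN_ON_ONCE_DUMP(cond, dump_fake_vma(vma))

/* Returns true if the range is bogus; warns and dumps only the first time. */
static bool check_vma(const struct fake_vma *vma)
{
	return WARN_ON_ONCE_FAKE_VMA(vma->start >= vma->end, vma);
}
```

Whether passing a statement into a macro like this is nicer than the current duplication is debatable, which is probably the "ugly" part.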
>
>> #define VM_WARN_ON_VMG(cond, vmg) ({ \
>> int __ret_warn = !!(cond); \
>> \
>> @@ -115,6 +126,7 @@ void vma_iter_dump_tree(const struct vma_iterator *vmi);
>> #define VM_WARN_ON_FOLIO(cond, folio) BUILD_BUG_ON_INVALID(cond)
>> #define VM_WARN_ON_ONCE_FOLIO(cond, folio) BUILD_BUG_ON_INVALID(cond)
>> #define VM_WARN_ON_ONCE_MM(cond, mm) BUILD_BUG_ON_INVALID(cond)
>> +#define VM_WARN_ON_ONCE_VMA(cond, vma) BUILD_BUG_ON_INVALID(cond)
>> #define VM_WARN_ON_VMG(cond, vmg) BUILD_BUG_ON_INVALID(cond)
>> #define VM_WARN_ONCE(cond, format...) BUILD_BUG_ON_INVALID(cond)
>> #define VM_WARN(cond, format...) BUILD_BUG_ON_INVALID(cond)
>> diff --git a/mm/gup.c b/mm/gup.c
>> index e065a49842a87..3c3931fcdd820 100644
>> --- a/mm/gup.c
>> +++ b/mm/gup.c
>> @@ -64,11 +64,11 @@ static inline void sanity_check_pinned_pages(struct page **pages,
>> !folio_test_anon(folio))
>> continue;
>> if (!folio_test_large(folio) || folio_test_hugetlb(folio))
>> - VM_BUG_ON_PAGE(!PageAnonExclusive(&folio->page), page);
>> + VM_WARN_ON_ONCE_PAGE(!PageAnonExclusive(&folio->page), page);
>> else
>> /* Either a PTE-mapped or a PMD-mapped THP. */
>> - VM_BUG_ON_PAGE(!PageAnonExclusive(&folio->page) &&
>> - !PageAnonExclusive(page), page);
>> + VM_WARN_ON_ONCE_PAGE(!PageAnonExclusive(&folio->page) &&
>> + !PageAnonExclusive(page), page);
>
> Nit but wouldn't VM_WARN_ON_ONCE_FOLIO() work better here?
No, we want the actual problematic page here, as that can give us clues
about what is going wrong.
For the small-folio case above we could use it, though.
--
Cheers,
David / dhildenb