Message-ID: <8eb52527-b159-4449-9bb7-34b9cfac05a6@redhat.com>
Date: Fri, 24 Jan 2025 10:05:20 +0100
From: David Hildenbrand <david@...hat.com>
To: Andrew Morton <akpm@...ux-foundation.org>,
Liu Shixin <liushixin2@...wei.com>
Cc: Kefeng Wang <wangkefeng.wang@...wei.com>,
Kemeng Shi <shikemeng@...weicloud.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
Mel Gorman <mgorman@...hsingularity.net>,
Matthew Wilcox <willy@...radead.org>, Nanyong Sun <sunnanyong@...wei.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/compaction: fix UBSAN shift-out-of-bounds warning
On 24.01.25 07:20, Andrew Morton wrote:
> On Thu, 23 Jan 2025 10:10:29 +0800 Liu Shixin <liushixin2@...wei.com> wrote:
>
>> syzkaller reported a UBSAN shift-out-of-bounds warning of (1UL << order)
>
> A Link: to the syzkaller report would be great, please.
>
Hi,
>> in isolate_freepages_block(). The bogus compound_order can be any value
>> because it is union with flags. Add back the MAX_PAGE_ORDER check to fix
>> the warning.
>
> OK, I'd never noticed compound_order()'s restrictions before. It looks
> like a crazy thing - what use is it if it can return "wild return
> values"?
It's perfectly fine to call if we hold a folio reference. There is some code that
wants to avoid that, so we might see the page concurrently get freed and
a compound page dissolved.
Similar to doing a folio_test_large() or folio_nr_pages() etc. without a
folio reference.
>
> Can someone please explain what's going on here and suggest what we can
> do about it?
Note the comment:
	/*
	 * For compound pages such as THP and hugetlbfs, we can save
	 * potentially a lot of iterations if we skip them at once.
	 * The check is racy, but we can consider only valid values
	 * and the only danger is skipping too much.
	 */
So there is not really anything going wrong. Racy access can result in
skipping too much.
The UBSAN warning is not anything critical.
>
> For example, should we have a compound_order_not_wild() which is called
> with refcounted pages and which cannot return "wild" numbers? Or
> something else.
We only have a handful of these racy usages.
Observe another one with a similar MAX_PAGE_ORDER check:
	/*
	 * skip hugetlbfs if we are not compacting for pages
	 * bigger than its order. THPs and other compound pages
	 * are handled below.
	 */
	if (!cc->alloc_contig) {
		const unsigned int order = compound_order(page);

		if (order <= MAX_PAGE_ORDER) {
			low_pfn += (1UL << order) - 1;
			nr_scanned += (1UL << order) - 1;
		}
		goto isolate_fail;
	}
So the common pattern for this racy access, in code that operates on pageblock
granularity, is to check against MAX_PAGE_ORDER before advancing based on the
racily obtained value.
Note that we do have folios with order > MAX_PAGE_ORDER; it is just that
*this code* only works in pageblock chunks, so <= MAX_PAGE_ORDER is
all it needs to advance.
So having a compact_racy_compound_order() in this file and replacing all instances
here where we sanitize against MAX_PAGE_ORDER might be reasonable, because page
compaction works on pageblocks (which are <= MAX_PAGE_ORDER).
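A rough user-space sketch of what such a helper could look like. Everything here is an assumption for illustration: compact_racy_compound_order() does not exist today, MAX_PAGE_ORDER is hard-coded to 10 (the real value depends on the config), and the helper takes the racily read order directly, whereas a kernel version would presumably take a struct page and call compound_order() itself.

```c
#include <assert.h>

/* Assumption for this sketch: the real MAX_PAGE_ORDER depends on the kernel config. */
#define MAX_PAGE_ORDER 10U

/*
 * Hypothetical helper (not an existing kernel function): takes an order
 * that was read without a folio reference and may therefore be garbage.
 * Out-of-range values collapse to 0, so callers advance by one page only.
 */
static inline unsigned int compact_racy_compound_order(unsigned int racy_order)
{
	return racy_order <= MAX_PAGE_ORDER ? racy_order : 0;
}

/* Extra pages a scanner may skip beyond the current one; the shift is always in range. */
static inline unsigned long compact_racy_skip(unsigned int racy_order)
{
	return (1UL << compact_racy_compound_order(racy_order)) - 1;
}

/* Small self-check of the sketch. */
static inline void compact_racy_selfcheck(void)
{
	assert(compact_racy_skip(0) == 0);
	assert(compact_racy_skip(9) == 511);
	assert(compact_racy_skip(MAX_PAGE_ORDER) == 1023);
	assert(compact_racy_skip(64) == 0);	/* wild value: no shift-out-of-bounds */
}
```

With something like this, the open-coded "if (order <= MAX_PAGE_ORDER)" blocks would turn into a plain "low_pfn += compact_racy_skip(...)": a bogus order skips nothing extra, which matches what the existing checks already do.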
--
Cheers,
David / dhildenb