Message-ID: <ab06fae3-3239-ee70-87b8-7ec380e47920@redhat.com>
Date: Tue, 5 Oct 2021 19:32:59 +0200
From: David Hildenbrand <david@...hat.com>
To: Johannes Weiner <hannes@...xchg.org>,
Matthew Wilcox <willy@...radead.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>, linux-mm@...ck.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [GIT PULL] Memory folios for v5.15
On 05.10.21 19:29, Johannes Weiner wrote:
> On Tue, Oct 05, 2021 at 02:52:01PM +0100, Matthew Wilcox wrote:
>> On Mon, Aug 23, 2021 at 05:26:41PM -0400, Johannes Weiner wrote:
>>> On one hand, the ambition appears to substitute folio for everything
>>> that could be a base page or a compound page even inside core MM
>>> code. Since there are very few places in the MM code that expressly
>>> deal with tail pages in the first place, this amounts to a conversion
>>> of most MM code - including the LRU management, reclaim, rmap,
>>> migrate, swap, page fault code etc. - away from "the page".
>>>
>>> However, this far exceeds the goal of a better mm-fs interface. And
>>> the value proposition of a full MM-internal conversion, including
>>> e.g. the less exposed anon page handling, is much more nebulous. It's
>>> been proposed to leave anon pages out, but IMO to keep that direction
>>> maintainable, the folio would have to be translated to a page quite
>>> early when entering MM code, rather than propagating it inward, in
>>> order to avoid huge, massively overlapping page and folio APIs.
>>
>> Here's an example where our current confusion between "any page"
>> and "head page" at least produces confusing behaviour, if not an
>> outright bug, isolate_migratepages_block():
>>
>> 	page = pfn_to_page(low_pfn);
>> 	...
>> 	if (PageCompound(page) && !cc->alloc_contig) {
>> 		const unsigned int order = compound_order(page);
>>
>> 		if (likely(order < MAX_ORDER))
>> 			low_pfn += (1UL << order) - 1;
>> 		goto isolate_fail;
>> 	}
>>
>> compound_order() does not expect a tail page; it returns 0 unless it's
>> a head page. I think what we actually want to do here is:
>>
>> 	if (!cc->alloc_contig) {
>> 		struct page *head = compound_head(page);
>> 		if (PageHead(head)) {
>> 			const unsigned int order = compound_order(head);
>>
>> 			low_pfn |= (1UL << order) - 1;
>> 			goto isolate_fail;
>> 		}
>> 	}
>>
>> Not earth-shattering; not even necessarily a bug. But it's an example
>> of how the way the code reads differs from how the code is executed,
>> and that's potentially dangerous. Having a different type for tail
>> and not-tail pages prevents the muddy thinking that can lead to
>> tail pages being passed to compound_order().
>
> Thanks for digging this up. I agree the second version is much better.
>
> My question is still whether the extensive folio whitelisting of
> everybody else is the best way to bring those codepaths to light.
>
> The above isn't totally random. That code is a pfn walker which
> translates from the basepage address space to an ambiguous struct page
> object. There are more of those, but we can easily identify them: all
> uses of pfn_to_page() and virt_to_page() indicate that the code needs
> an audit for how exactly they're using the returned page.
+pfn_to_online_page()
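
To make the difference between the two quoted isolate_migratepages_block() variants concrete, here is a minimal userspace sketch. Everything in it (struct toy_page, toy_memmap, the toy_* helpers, skip_with_add(), skip_with_or()) is hypothetical and only models the behaviour described above: compound_order() returning 0 for a tail page, and compound_head() mapping a tail page back to its head. It is not kernel code, it leaves out the MAX_ORDER check, and it assumes compound pages are naturally aligned to their size, which is what makes the "|=" form land on the last pfn.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; this is NOT the kernel's layout. */
struct toy_page {
	bool head;          /* first page of a compound page */
	bool tail;          /* any later page of a compound page */
	unsigned int order; /* meaningful on the head page only */
	size_t head_pfn;    /* pfn of the head, meaningful on tail pages only */
};

#define TOY_NR_PAGES 32

/* Zero-initialised, so every entry starts out as a plain base page. */
static struct toy_page toy_memmap[TOY_NR_PAGES];

/* Mark pfns [head_pfn, head_pfn + 2^order) as one compound page. */
static void toy_make_compound(size_t head_pfn, unsigned int order)
{
	size_t nr = (size_t)1 << order;

	for (size_t i = 0; i < nr; i++) {
		struct toy_page *p = &toy_memmap[head_pfn + i];

		p->head = (i == 0);
		p->tail = (i != 0);
		p->order = (i == 0) ? order : 0;
		p->head_pfn = head_pfn;
	}
}

/* Models the quoted description: 0 unless the page is a head page. */
static unsigned int toy_compound_order(const struct toy_page *page)
{
	return page->head ? page->order : 0;
}

/* Tail pages map to their head; head and base pages map to themselves. */
static const struct toy_page *toy_compound_head(const struct toy_page *page)
{
	return page->tail ? &toy_memmap[page->head_pfn] : page;
}

/* First variant: asks whatever page the walker landed on for its order. */
static size_t skip_with_add(size_t pfn)
{
	const struct toy_page *page = &toy_memmap[pfn];

	if (page->head || page->tail)
		pfn += ((size_t)1 << toy_compound_order(page)) - 1;
	return pfn;
}

/* Second variant: resolves the head first, then rounds up with OR. */
static size_t skip_with_or(size_t pfn)
{
	const struct toy_page *head = toy_compound_head(&toy_memmap[pfn]);

	if (head->head)
		pfn |= ((size_t)1 << toy_compound_order(head)) - 1;
	return pfn;
}
```

With an order-3 compound page at pfns 8..15, both helpers return 15 when the walker lands on the head (pfn 8). When it lands on a tail (say pfn 10), skip_with_add() returns 10, so the walker skips nothing, while skip_with_or() returns 15, the last pfn of the compound page.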
--
Thanks,
David / dhildenb