linux-kernel - Re: [GIT PULL] Memory folios for v5.15


Message-ID: <ab06fae3-3239-ee70-87b8-7ec380e47920@redhat.com>
Date:   Tue, 5 Oct 2021 19:32:59 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Johannes Weiner <hannes@...xchg.org>,
        Matthew Wilcox <willy@...radead.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [GIT PULL] Memory folios for v5.15

On 05.10.21 19:29, Johannes Weiner wrote:
> On Tue, Oct 05, 2021 at 02:52:01PM +0100, Matthew Wilcox wrote:
>> On Mon, Aug 23, 2021 at 05:26:41PM -0400, Johannes Weiner wrote:
>>> On one hand, the ambition appears to substitute folio for everything
>>> that could be a base page or a compound page even inside core MM
>>> code. Since there are very few places in the MM code that expressly
>>> deal with tail pages in the first place, this amounts to a conversion
>>> of most MM code - including the LRU management, reclaim, rmap,
>>> migrate, swap, page fault code etc. - away from "the page".
>>>
>>> However, this far exceeds the goal of a better mm-fs interface. And
>>> the value proposition of a full MM-internal conversion, including
>>> e.g. the less exposed anon page handling, is much more nebulous. It's
>>> been proposed to leave anon pages out, but IMO to keep that direction
>>> maintainable, the folio would have to be translated to a page quite
>>> early when entering MM code, rather than propagating it inward, in
>>> order to avoid huge, massively overlapping page and folio APIs.
>>
>> Here's an example where our current confusion between "any page"
>> and "head page" at least produces confusing behaviour, if not an
>> outright bug, isolate_migratepages_block():
>>
>>                  page = pfn_to_page(low_pfn);
>> ...
>>                  if (PageCompound(page) && !cc->alloc_contig) {
>>                          const unsigned int order = compound_order(page);
>>
>>                          if (likely(order < MAX_ORDER))
>>                                  low_pfn += (1UL << order) - 1;
>>                          goto isolate_fail;
>>                  }
>>
>> compound_order() does not expect a tail page; it returns 0 unless it's
>> a head page.  I think what we actually want to do here is:
>>
>> 		if (!cc->alloc_contig) {
>> 			struct page *head = compound_head(page);
>> 			if (PageHead(head)) {
>> 				const unsigned int order = compound_order(head);
>>
>> 				low_pfn |= (1UL << order) - 1;
>> 				goto isolate_fail;
>> 			}
>> 		}
>>
>> Not earth-shattering; not even necessarily a bug.  But it's an example
>> of how the way the code reads differs from how the code is executed,
>> and that's potentially dangerous.  Having a different type for tail
>> and not-tail pages prevents the muddy thinking that can lead to
>> tail pages being passed to compound_order().
> 
> Thanks for digging this up. I agree the second version is much better.
> 
> My question is still whether the extensive folio whitelisting of
> everybody else is the best way to bring those codepaths to light.
> 
> The above isn't totally random. That code is a pfn walker which
> translates from the basepage address space to an ambiguous struct page
> object. There are more of those, but we can easily identify them: all
> uses of pfn_to_page() and virt_to_page() indicate that the code needs
> an audit for how exactly they're using the returned page.

+pfn_to_online_page()


-- 
Thanks,

David / dhildenb