Date:   Fri, 8 Dec 2023 14:00:34 +0000
From:   Ryan Roberts <ryan.roberts@....com>
To:     Matthew Wilcox <willy@...radead.org>,
        Barry Song <21cnbao@...il.com>
Cc:     akpm@...ux-foundation.org, catalin.marinas@....com,
        david@...hat.com, linux-mm@...ck.org, steven.price@....com,
        will@...nel.org, linux-arm-kernel@...ts.infradead.org,
        mhocko@...e.com, shy828301@...il.com, v-songbaohua@...o.com,
        wangkefeng.wang@...wei.com, xiang@...nel.org, ying.huang@...el.com,
        yuzhao@...gle.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arm64: mm: support THP_SWAP on hardware with MTE

On 08/12/2023 13:22, Matthew Wilcox wrote:
> On Fri, Dec 08, 2023 at 08:34:01PM +1300, Barry Song wrote:
>> arch_prepare_to_swap() should take folio rather than page as parameter
>> because we support THP swap-out as a whole. It saves tags for all
>> pages in a large folio.
>>
>> Meanwhile, arch_swap_restore() now moves to use page parameter rather
>> than folio because swap-in, refault and MTE tags are working at the
>> granularity of base pages rather than folio:
>> 1. a large folio in swapcache can be partially unmapped, thus, MTE
>> tags for the unmapped pages will be invalidated;
>> 2. users might use mprotect() to set MTEs on a part of a large folio.
> 
> I would argue that using mprotect() to set MTEs on part of a large folio
> should cause that folio to be split.  Could the user give us any
> stronger signal that this memory is being used for different purposes,
> and therefore should not be managed as a single entity?

I agree this probably makes sense here. But as I understand it, splitting is
best effort? It can fail due to a long-term GUP pin, right? In which case we
still have to handle MTE on part of a large folio safely, even if not
performantly.
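
To make the "best effort" point concrete, here is a toy userspace model in C.
It is only a sketch: `toy_folio`, `toy_try_split()` and `toy_tag_range()` are
hypothetical names invented for illustration and are not kernel APIs. It
assumes a split is refused while the folio is pinned and shows that the
per-page tag state must still end up correct in that case.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES 4  /* hypothetical: base pages per toy folio */

struct toy_folio {
    bool pinned;                 /* stands in for a long-term GUP pin */
    bool split;                  /* set once the toy split succeeds */
    bool tagged[TOY_NR_PAGES];   /* per-page "has valid tags" state */
};

/* Best-effort split: refuses while the folio is pinned. */
static bool toy_try_split(struct toy_folio *f)
{
    if (f->pinned)
        return false;
    f->split = true;
    return true;
}

/*
 * Tag pages [start, start + n). The split is attempted first, but the
 * per-page state is updated either way, so a failed split still leaves
 * correct (if mixed) tag state inside the one large folio.
 * Returns whether the split succeeded.
 */
static bool toy_tag_range(struct toy_folio *f, size_t start, size_t n)
{
    bool did_split = toy_try_split(f);

    for (size_t i = start; i < start + n && i < TOY_NR_PAGES; i++)
        f->tagged[i] = true;
    return did_split;
}
```

With `pinned` set, `toy_tag_range(&f, 1, 2)` returns false and leaves pages 1
and 2 tagged while pages 0 and 3 stay untagged, which is the partial large
folio state that has to be handled safely.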

As an aside, I don't think it's clear-cut that we would always prefer to split
based on user space mprotect()/madvise()/etc. calls. IIUC, there are garbage
collectors that temporarily mark pages RO then switch them back to RW. I
wouldn't want to split here and lose the benefits of contpte forever. I'm
handwaving because I haven't looked into the exact mechanisms yet. But I think
we need to understand these users better before deciding on an "always split
based on user hints" policy.
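
For reference, the RO-then-RW pattern can be sketched from user space as
follows. This is a minimal illustration, not taken from any particular garbage
collector; the function name `toggle_partial_protection()` and the choice of
the second half of the mapping are assumptions. It says nothing about whether
the kernel backs the mapping with a large folio or splits it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map npages anonymous pages, make only the second half read-only, then
 * make it writable again. Returns 0 on success, -1 on any failure.
 */
static int toggle_partial_protection(size_t npages)
{
    long psz = sysconf(_SC_PAGESIZE);
    if (psz <= 0 || npages < 2)
        return -1;

    size_t page = (size_t)psz;
    size_t len = npages * page;
    size_t half = (npages / 2) * page;

    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* Touch both halves so the pages are actually faulted in. */
    buf[0] = 1;
    buf[half] = 2;

    int ret = -1;
    /* Temporarily write-protect part of the range, then restore RW. */
    if (mprotect(buf + half, len - half, PROT_READ) == 0 &&
        mprotect(buf + half, len - half, PROT_READ | PROT_WRITE) == 0) {
        buf[half] = 3;  /* writable again */
        ret = 0;
    }

    munmap(buf, len);
    return ret;
}
```

A caller would invoke it as `toggle_partial_protection(16)`. The protection
change is short-lived here, which is why treating it as a permanent "manage
these pages separately" signal may be too aggressive.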

> 
>> Thus, it won't be easy to manage MTE tags at the granularity of folios
>> since we do have some cases in which some pages in a large folio
>> have valid tags while the others don't. Furthermore,
>> trying to restore MTE tags for a whole folio can lead to many loops and
>> early exits even if the large folio in swapcache is still entirely
>> mapped, since do_swap_page() only sets the PTE and frees swap for the base
>> page where the page fault is happening.
>>
>> But we still have a chance to restore tags for a whole large folio
>> once we support swap-in large folio. So this job is deferred till we
>> can do refault and swap-in as a large folio.
> 
> I strongly disagree with changing the interface to arch_swap_restore()
> from folio to page.
