linux-kernel - Re: [PATCH v5 00/38] New page table range API

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <56ca93af-67dc-9d10-d27e-00c8d7c20f1b@linux.ibm.com>
Date:   Thu, 13 Jul 2023 12:42:44 +0200
From:   Christian Borntraeger <borntraeger@...ux.ibm.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Claudio Imbrenda <imbrenda@...ux.ibm.com>,
        linux-arch@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
        linux-s390 <linux-s390@...r.kernel.org>
Subject: Re: [PATCH v5 00/38] New page table range API



Am 11.07.23 um 14:36 schrieb Matthew Wilcox:
> On Tue, Jul 11, 2023 at 11:07:06AM +0200, Christian Borntraeger wrote:
>> Am 10.07.23 um 22:43 schrieb Matthew Wilcox (Oracle):
>>> This patchset changes the API used by the MM to set up page table entries.
>>> The four APIs are:
>>>       set_ptes(mm, addr, ptep, pte, nr)
>>>       update_mmu_cache_range(vma, addr, ptep, nr)
>>>       flush_dcache_folio(folio)
>>>       flush_icache_pages(vma, page, nr)
>>>
>>> flush_dcache_folio() isn't technically new, but no architecture
>>> implemented it, so I've done that for them.  The old APIs remain around
>>> but are mostly implemented by calling the new interfaces.
>>>
>>> The new APIs are based around setting up N page table entries at once.
>>> The N entries belong to the same PMD, the same folio and the same VMA,
>>> so ptep++ is a legitimate operation, and locking is taken care of for
>>> you.  Some architectures can do a better job of it than just a loop,
>>> but I have hesitated to make too deep a change to architectures I don't
>>> understand well.
>>>
>>> One thing I have changed in every architecture is that PG_arch_1 is now a
>>> per-folio bit instead of a per-page bit.  This was something that would
>>> have to happen eventually, and it makes sense to do it now rather than
>>> iterate over every page involved in a cache flush and figure out if it
>>> needs to happen.
>>
>> I think we do use PG_arch_1 on s390 for our secure page handling and
>> making this perf folio instead of physical page really seems wrong
>> and it probably breaks this code.
> 
> Per-page flags are going away in the next few years, so you're going to
> need a new design.  s390 seems to do a lot of unusual things.  I wish
> you'd talk to the rest of us more.

I understand you point from a logical point of view, but a 4k page frame
is also a hardware defined memory region. And I think not only for us.
How do you want to implement hardware poisoning for example?
Marking the whole folio with PG_hwpoison seems wrong.