linux-kernel - Re: [PATCH v1] mm/madvise: Always set ptes via arch helpers

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ba694acd-07b7-4f28-828a-19bf4c803ca0@redhat.com>
Date: Fri, 7 Mar 2025 18:43:35 +0100
From: David Hildenbrand <david@...hat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 Ryan Roberts <ryan.roberts@....com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, stable@...r.kernel.org,
 Oscar Salvador <osalvador@...e.de>
Subject: Re: [PATCH v1] mm/madvise: Always set ptes via arch helpers

>> It's certainly not read-only in general. Just having a quick look to verify, the
>> very first callback I landed on was clear_refs_pte_range(), which implements
>> .pmd_entry to clear the softdirty and access flags from a leaf pmd or from all
>> the child ptes.
> 
> Yup sorry I misspoke, working some long hours atm so forgive me :) what I meant
> to say is that we either read or modify existing.
> 
> And yes users do do potentially crazy things and yada yada.
> 
> David and I have spoken quite a few times about implementing generic page
> table code that could help abstract a lot of things, and it feels like this
> logic could all be rejigged in some fashion as to prevent the kind of
> 'everybody does their own handler' logic.q

Oscar is currently prototyping a new pt walker that will batch entries 
(e.g., folio ranges, pte none ares), and not use indirect calls. The 
primary focus will will read-traversal, but nothing speaks against 
modifications (likely helpers for installing pte_none()->marker could be 
handy, and just creating page tables if we hit pmd_none() etc.).

Not sure yet how many use cases we can cover with the initial approach. 
But the idea is to start with something that works for many cases, to 
then gradually improve it.


> 
> I guess I felt it was more _dangerous_ as you are establishing _new_
> mappings here, with the page tables being constructed for you up to the PTE
> level.
> 
> And wanted to 'lock things down' somewhat.
> 
> But indeed, all this cries out for a need for a more generalised, robust
> interface that handles some of what the downstream users of this are doing.
> 
>>
>>>
>>> When setting things are a little different, I'd rather not open up things to a
>>> user being able to do *whatever*, but rather limit to the smallest scope
>>> possible for installing the PTE.
>>
>> Understandable, but personally I think it will lead to potential misunderstandings:
>>
>>   - it will get copy/pasted as an example of how to set a pte (which is wrong;
>> you have to use set_pte_at()/set_ptes()). There is currently only a single other
>> case of direct dereferencing a pte to set it (in write_protect_page()).
> 
> Yeah, at least renaming the param could help, as 'ptep' implies you really
> do have a pointer to the page table entry.
> 
> If we didn't return an error we could just return the PTE value or
> something... hm.
> 
>>
>>   - new users of .install_pte may assume (like I did) that the passed in ptep is
>> pointing to the pgtable and they will manipulate it with arch helpers. arm64
>> arch helpers all assume they are only ever passed pointers into pgtable memory.
>> It will end horribly if that is not the case.
> 
> It will end very horribly indeed :P or perhaps with more of a fizzle than
> anticipated...

Yes, I'm hoping we can avoid new users with the old api ... :)

> 
>>
>>>
>>> And also of course, it allows us to _mandate_ that set_pte_at() is used so we do
>>> the right thing re: arches :)
>>>
>>> I could have named the parameter better though, in guard_install_pte_entry()
>>> would be better to have called it 'new_pte' or something.
>>
>> I'd suggest at least describing this in the documentation in pagewalk.h. Or
>> better yet, you could make the pte the return value for the function. Then it is
>> clear because you have no pointer. You'd lose the error code but the only user
>> of this currently can't fail anyway.
> 
> Haha and here you make the same point I did above... great minds :)
> 
> I mean yeah returning a pte would make it clearer what you're doing, but
> then it makes it different from every other callback... but this already is
> different :)
> 
> I do very much want the ability to return an error value to stop the walk
> (if you return >0 you can indicate to caller that a non-error stop occurred
> for instance, something I use on the reading side).
> 
> But we do need to improve this one way or another, at the very least the
> documentation/comments.
> 
> David - any thoughts?

Maybe document "don't use this until we have something better" :D


> 
> I'm not necessarily against just making this consitent, but I like this
> property of us controlling what happens instead of just giving a pointer
> into the page table - the principle of exposing the least possible.
> 
> ANWYAY, I will add to my ever expanding whiteboard TODO list [literally the
> only todo that work for me] to look at this, will definitely improve docs
> at very least.

Maybe we can talk at LSF/MM about this. I assume Oscar might be 
partially covering that in his hugetlb talk.

-- 
Cheers,

David / dhildenb