lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cbea0364-b8e3-4e3b-852f-b16436b60de1@kernel.org>
Date: Thu, 18 Dec 2025 10:49:26 +0100
From: "David Hildenbrand (Red Hat)" <david@...nel.org>
To: Ryan Roberts <ryan.roberts@....com>,
 Samuel Holland <samuel.holland@...ive.com>,
 Palmer Dabbelt <palmer@...belt.com>, Paul Walmsley <pjw@...nel.org>,
 linux-riscv@...ts.infradead.org, Andrew Morton <akpm@...ux-foundation.org>,
 linux-mm@...ck.org
Cc: devicetree@...r.kernel.org, Suren Baghdasaryan <surenb@...gle.com>,
 linux-kernel@...r.kernel.org, Mike Rapoport <rppt@...nel.org>,
 Michal Hocko <mhocko@...e.com>, Conor Dooley <conor@...nel.org>,
 Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 Krzysztof Kozlowski <krzk+dt@...nel.org>, Alexandre Ghiti <alex@...ti.fr>,
 Emil Renner Berthing <kernel@...il.dk>, Rob Herring <robh+dt@...nel.org>,
 Vlastimil Babka <vbabka@...e.cz>, "Liam R . Howlett"
 <Liam.Howlett@...cle.com>
Subject: Re: [PATCH v3 08/22] mm: Allow page table accessors to be
 non-idempotent

On 12/11/25 14:59, Ryan Roberts wrote:
> On 11/12/2025 00:33, Samuel Holland wrote:
>> On 2025-11-28 2:47 AM, David Hildenbrand (Red Hat) wrote:
>>> On 11/27/25 17:57, Ryan Roberts wrote:
>>>> On 13/11/2025 01:45, Samuel Holland wrote:
>>>>> Currently, some functions such as pte_offset_map() are passed both
>>>>> pointers to hardware page tables, and pointers to previously-read PMD
>>>>> entries on the stack. To ensure correctness in the first case, these
>>>>> functions must use the page table accessor function (pmdp_get()) to
>>>>> dereference the supplied pointer. However, this means pmdp_get() is
>>>>> called twice in the second case. This double call must be avoided if
>>>>> pmdp_get() applies some non-idempotent transformation to the value.
>>>>>
>>>>> Avoid the double transformation by calling set_pmd() on the stack
>>>>> variables where necessary to keep set_pmd()/pmdp_get() calls balanced.
>>>>
>>>> I don't think this is a good solution.
>>>
>>> Agreed,
>>>
>>>      set_pmd(&pmd, pmd);
>>>
>>> is rather horrible.
>> I agree that this patch is ugly. The only way I see to avoid code like this is
>> to refactor (or duplicate) the functions so no function takes pointers to both
>> hardware page tables and on-stack page table entries. Is that sort of
>> refactoring the right direction to go for v4?
> 
>  From a quick look at the code, I think that some cases are solvable by
> refactoring to pass the value instead of the pointer, and leave it to the higher
> level decide how to read the value from the pointer - it knows if it is pointing
> to HW pgtable or if it's a (e.g) stack value.
> 
> But the more I look at the code, the more instances I find where pointers to
> stack variables are being passed to arch pgtable helpers as if they are HW
> pgtable entry pointers. (Mainly pmd level).
> 
> I wonder if we need to bite the bullet and explicitly separate the types? At
> each level, we have:
> 
>   1. page table entry value
>   2. pointer to page table entry _value_ (e.g. pointer to pXX_t on stack)
>   3. pointer to page table entry in HW pgtable
> 
> Today, 1 is represented by pte_t, pmd_t, etc. 2 and 3 are represented by the
> same type; pte_t*, pmd_t*, etc.
> 
> If we create a new type for 3, it will both document and enforce when type 2 or
> type 3 is required.
> 
> e.g:
> 
> // pte_t: defined by arch.
> typedef unsigned long pte_t;
> 
> // ptep_t: new opaque type that can't be dereferenced.
> struct __ptep_t;
> typedef struct __ptep_t *ptep_t;

This is what I had in mind when we last discussed this topic and I 
suggested a way forward to not play whack-a-mole with new users that do 
*ptep showing up.

Agreed that we ideally indicate that this is a HW PTE pointer that must 
be de-referenced through ptep_get() or similar. (maybe we'd find a new 
name for this set of functions).

I talked to Samuel at LPC, pointing him at the way XEN-pv implemented 
support for changing PFNs in ptes. We might not need all of that rework 
to move forward with the risc-v change.

Having that said, I also agree that this would be a cleanup worth having 
(which will result in quite a bit of churn :) ).

So if someone wants to bump up the patch count, speak now.

-- 
Cheers

David

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ