Message-ID: <61f84216-75fa-477b-a9df-6f24476ecd8d@lucifer.local>
Date: Fri, 15 Nov 2024 18:02:39 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Jann Horn <jannh@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
        Jonathan Corbet <corbet@....net>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Vlastimil Babka <vbabka@...e.cz>, Alice Ryhl <aliceryhl@...gle.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Matthew Wilcox <willy@...radead.org>, Mike Rapoport <rppt@...nel.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Hillf Danton <hdanton@...a.com>, Qi Zheng <zhengqi.arch@...edance.com>,
        SeongJae Park <sj@...nel.org>, Bagas Sanjaya <bagasdotme@...il.com>,
        linux-mm@...ck.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, Matteo Rizzo <matteorizzo@...gle.com>
Subject: Re: [PATCH] docs/mm: add more warnings around page table access

On Thu, Nov 14, 2024 at 10:12:00PM +0100, Jann Horn wrote:
> Make it clearer that holding the mmap lock in read mode is not enough
> to traverse page tables, and that just having a stable VMA is not enough
> to read PTEs.
>
> Suggested-by: Matteo Rizzo <matteorizzo@...gle.com>
> Signed-off-by: Jann Horn <jannh@...gle.com>

I have some queries before we move forward, so I'd like a little more
clarification, and perhaps some extra meat on the bones, first.

Broadly I'm very glad you have done this, so it's just a matter of sorting
out details first! :>)

> ---
> @akpm: Please don't put this in your tree before Lorenzo has replied.
>
> @Lorenzo:
> This is intended to go on top of your documentation patch.
> If you think this is a sensible change, do you prefer to squash it into
> your patch or do you prefer having akpm take this as a separate patch?
> IDK what works better...

I think a new patch is better, as I'd like the original to settle down now
and the whole point of this doc is that it's a living thing that many
people can contribute to, update, etc.

For instance, Suren is updating it as part of one of his series to correct
things that he changes in that series, which is really nice.

> ---
>  Documentation/mm/process_addrs.rst | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
> index 1bf7ad010fc063d003bb857bb3b695a3eafa0b55..9bdf073d0c3ebea1707812508a309aa4a6163660 100644
> --- a/Documentation/mm/process_addrs.rst
> +++ b/Documentation/mm/process_addrs.rst
> @@ -339,6 +339,16 @@ When **installing** page table entries, the mmap or VMA lock must be held to
>  keep the VMA stable. We explore why this is in the page table locking details
>  section below.
>
> +.. warning:: Taking the mmap lock in read mode **is not sufficient** for
> +             traversing page tables; you must also ensure that a VMA exists that
> +             covers the range being accessed.

Hm, but we say later we don't need _any_ locks for traversal, and here we
say we need mmap read lock. Do you mean installing page table entries?

Or do you mean to say that if the range is not spanned by a VMA, you must
at least acquire the write lock to preclude this?

This seems quite unclear.

I kind of didn't want to touch on the horrors of fiddling about without a
VMA, so I'd rather this very clearly say something like 'it is unusual to
manipulate page tables which are not spanned by a VMA, and there are
special requirements for this operation' etc., otherwise this just adds
more noise and confusion I think.
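
For concreteness, the rule as the proposed warning states it (read lock held,
plus a VMA covering the address, before any walk) could be modelled with a
single-threaded userspace toy like the one below. Every name here (toy_mm,
toy_vma, toy_walk_addr and so on) is hypothetical and is not kernel API, the
"lock" is only a reader counter standing in for the mmap lock, and the flat
array stands in for the VMA tree. It only illustrates the order of the checks,
not real synchronisation.

```c
#include <stddef.h>

/* Hypothetical stand-in for a VMA: covers [start, end). */
struct toy_vma {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical stand-in for an mm: a reader count instead of a real
 * mmap lock, and a flat array instead of the VMA tree. */
struct toy_mm {
	int mmap_readers;
	const struct toy_vma *vmas;
	size_t nr_vmas;
};

static void toy_mmap_read_lock(struct toy_mm *mm)   { mm->mmap_readers++; }
static void toy_mmap_read_unlock(struct toy_mm *mm) { mm->mmap_readers--; }

/* Return the VMA containing addr, or NULL. Caller holds the toy lock. */
static const struct toy_vma *toy_find_vma(const struct toy_mm *mm,
					  unsigned long addr)
{
	for (size_t i = 0; i < mm->nr_vmas; i++)
		if (addr >= mm->vmas[i].start && addr < mm->vmas[i].end)
			return &mm->vmas[i];
	return NULL;
}

/* Return 0 if a walk at addr would be permitted under the rule in the
 * warning, -1 if it is refused because no VMA covers addr. */
static int toy_walk_addr(struct toy_mm *mm, unsigned long addr)
{
	int ret = -1;

	toy_mmap_read_lock(mm);
	/* The read lock alone is not treated as sufficient: a VMA
	 * covering addr must also exist before we would walk. */
	if (toy_find_vma(mm, addr))
		ret = 0;	/* the page table walk itself would go here */
	toy_mmap_read_unlock(mm);

	return ret;
}
```

With VMAs at [0x1000, 0x3000) and [0x5000, 0x6000), toy_walk_addr() permits
0x1000 and 0x5800 and refuses 0x3000 and 0x4000, even though the read lock was
taken in every case.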

> +             This ensures you can't race with concurrent page table removal
> +             which happens with the mmap lock in read mode, in regions whose
> +             VMAs are no longer present in the VMA tree.
> +
> +             (Alternatively, the mmap lock can be taken in write mode, but that
> +             is heavy-handed and almost never the right choice.)

You kind of need to expand on why that is, I think!

> +
>  **Freeing** page tables is an entirely internal memory management operation and
>  has special requirements (see the page freeing section below for more details).
>
> @@ -450,6 +460,9 @@ the time of writing of this document.
>  Locking Implementation Details
>  ------------------------------
>
> +.. warning:: Locking rules for PTE-level page tables are very different from
> +             locking rules for page tables at other levels.
> +
>  Page table locking details
>  --------------------------
>
> @@ -470,8 +483,12 @@ additional locks dedicated to page tables:
>  These locks represent the minimum required to interact with each page table
>  level, but there are further requirements.
>
> -Importantly, note that on a **traversal** of page tables, no such locks are
> -taken. Whether care is taken on reading the page table entries depends on the
> +Importantly, note that on a **traversal** of page tables, sometimes no such
> +locks are taken. However, at the PTE level, at least concurrent page table
> +deletion must be prevented (using RCU) and the page table must be mapped into
> +high memory, see below.

Ugh I really do hate that we have to think about high memory. I'd like to
sort of deny it exists. But I suppose that's not an option.

As for the RCU thing, I guess this is why pte_offset_map_lock() is taking
the RCU read lock? Maybe worth mentioning something there or updating that
'interestingly' block... :>)

Or am I mistaken? I wasn't aware of this requirement, is this sort of
implied by the gup_fast() IRQ disabling stuff?

Please expand :)

> +
> +Whether care is taken on reading the page table entries depends on the
>  architecture, see the section on atomicity below.
>
>  Locking rules
>
> ---
> base-commit: 1e96a63d3022403e06cdda0213c7849b05973cd5
> change-id: 20241114-vma-docs-addition1-onv3-32df4e6dffcf
>
> --
> Jann Horn <jannh@...gle.com>
>

Thanks for this, your input is hugely appreciated both in the review and
now this, you're a gem!

Cheers, Lorenzo
