lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <d68af630-7fda-439f-98e2-6d7381c1bff1@lucifer.local>
Date: Mon, 24 Feb 2025 14:02:26 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Suren Baghdasaryan <surenb@...gle.com>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Matthew Wilcox <willy@...radead.org>, Vlastimil Babka <vbabka@...e.cz>,
        "Paul E . McKenney" <paulmck@...nel.org>, Jann Horn <jannh@...gle.com>,
        David Hildenbrand <david@...hat.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Shuah Khan <shuah@...nel.org>,
        linux-kselftest@...r.kernel.org, linux-api@...r.kernel.org,
        John Hubbard <jhubbard@...dia.com>, Juan Yescas <jyescas@...gle.com>,
        Kalesh Singh <kaleshsingh@...gle.com>
Subject: Re: [PATCH 1/4] mm: allow guard regions in file-backed and read-only
 mappings

One thing to note with this series is that it now implies file-backed VMAs
which install guard regions will now have an anon_vma installed if not
already present (i.e. if not post-CoW MAP_PRIVATE).

I have audited kernel source for instances of vma->anon_vma checks and
found nowhere where this would be problematic for pure file-backed mappings.

I also discussed (off-list) with Matthew who confirmed he can't see any
issue with this.

In effect, we treat these VMAs as if they are MAP_PRIVATE, only with 0
CoW'd pages. As a result, the rmap never has a reason to reference the
anon_vma from folios at any point and thus no unexpected or weird behaviour
results.

The anon_vma logic tries to avoid unnecessary anon_vma propagation on fork
so we ought to at least minimise overhead.

However, this is still overhead, and unwelcome overhead.

The whole reason we do this (in madvise_guard_install()) is to ensure that
fork _copies page tables_. Otherwise, in vma_needs_copy(), nothing will
indicate that we should do so.

This was already an unpleasant thing to have to do, but without a new VMA
flag we really have no reasonable means of ensuring this happens.

Going forward, I intend to add a new VMA flag, VM_MAYBE_GUARDED or
something like this.

This would have specific behaviour - for the purposes of merging, it would
be ignored. However on both split and merge, it will be propagated. It is
therefore 'sticky'.

This is to avoid having to traverse page tables to determine which parts of
a VMA contain guard regions and of course to maintain the desirable
qualities of guard regions - the lack of VMA propagation (+ thus slab
allocations of VMAs).

Adding this flag and adjusting vma_needs_copy() to reference it would
resolve the issue.

However :) we have a VMA flag space issue, so it'd render this a 64-bit
feature only.

Having discussed with Matthew a plan by which to perhaps extend available
flags for 32-bit we may going forward be able to avoid this. But this may
be a longer term project.

In the meantime, we'd have to resort to the anon_vma hack for 32-bit, using
the flag for 64-bit. The issue with this however is if we do then intend to
allow the flag to enable /proc/$pid/maps visibility (something this could
allow), it would also end up being 64-bit only which would be a pity.

Regardless - I wanted to highlight this behaviour as it is perhaps somewhat
surprising.

On Thu, Feb 13, 2025 at 06:17:00PM +0000, Lorenzo Stoakes wrote:
> There is no reason to disallow guard regions in file-backed mappings -
> readahead and fault-around both function correctly in the presence of PTE
> markers, equally other operations relating to memory-mapped files function
> correctly.
>
> Additionally, read-only mappings if introducing guard-regions, only
> restrict the mapping further, which means there is no violation of any
> access rights by permitting this to be so.
>
> Removing this restriction allows for read-only mapped files (such as
> executable files) correctly which would otherwise not be permitted.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> ---
>  mm/madvise.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 6ecead476a80..e01e93e179a8 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1051,13 +1051,7 @@ static bool is_valid_guard_vma(struct vm_area_struct *vma, bool allow_locked)
>  	if (!allow_locked)
>  		disallowed |= VM_LOCKED;
>
> -	if (!vma_is_anonymous(vma))
> -		return false;
> -
> -	if ((vma->vm_flags & (VM_MAYWRITE | disallowed)) != VM_MAYWRITE)
> -		return false;
> -
> -	return true;
> +	return !(vma->vm_flags & disallowed);
>  }
>
>  static bool is_guard_pte_marker(pte_t ptent)
> --
> 2.48.1
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ