linux-kernel - Re: [PATCH v2 4/6] mm/mremap: Replace can_modify_mm with can_modify_vma

Message-ID: <lkrt2eatc2cl5ai67femr7xnbewayxrthxz3yjtyazxy4nq3lq@h3qqce5tovvg>
Date: Mon, 12 Aug 2024 11:22:43 -0400
From: "Liam R. Howlett" <Liam.Howlett@...cle.com>
To: Pedro Falcato <pedro.falcato@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, oliver.sang@...el.com,
        torvalds@...ux-foundation.org, jeffxu@...gle.com,
        Michael Ellerman <mpe@...erman.id.au>
Subject: Re: [PATCH v2 4/6] mm/mremap: Replace can_modify_mm with
 can_modify_vma

* Pedro Falcato <pedro.falcato@...il.com> [240809 14:45]:
> On Fri, Aug 9, 2024 at 5:06 PM Liam R. Howlett <Liam.Howlett@...cle.com> wrote:
> >
> > * Pedro Falcato <pedro.falcato@...il.com> [240807 17:13]:
> > > Delegate all can_modify checks to the proper places. Unmap checks are
> > > done in do_unmap (et al).
> > >
> > > This patch allows for mremap partial failure in certain cases (for
> > > instance, when destination VMAs aren't sealed, but the source VMA is).
> > > It shouldn't be too troublesome, as you'd need to go out of your way to
> > > do illegal operations on a VMA.
> >
> > As mseal() is supposed to be a security thing, is the illegal operation
> > not a concern?
> 
> My 3 cents (note: I'm not a security guy):
> 
> - Linux m*() operations have been allowed to partially fail for ages.
> POSIX only disallows this in the munmap case (which is why we need all
> that detached VMA logic), but not in any other case. We have a lot of
> other failure points in these syscalls, and would require extensive
> refactoring to patch this up (very likely with an inevitable
> performance regression, as we saw in this case).
> 
> - Despite allowing for partial failure, this patch set always keeps
> the sealed VMAs untouched (so that invariant isn't broken). The munmap
> semantics remain untouched (and POSIXly correct) due to the detached
> VMA logic.
> 
> - I personally have not heard of a single attack making use of this,
> and the performance hit is very measurable and exists _for every
> user_, despite mseal being a very niche feature (I cannot find a
> single user of mseal upstream, both in debian code search, github,
> chromium, v8, glibc, and what have you).
> 

...

I really have no disagreement with the above statements, but looking at
this further: vma_to_resize() is called in 2 places:
1. mremap() syscall
mremap() calls vma_lookup() and then later calls vma_to_resize() which
also calls vma_lookup() in the first 5 lines of the function.

2. mremap_to() static function
mremap_to() is called only from mremap(), but earlier than
vma_to_resize().

If we move the vma check to mremap() after finding the vma, then we can
avoid partial failures due to mseal().  We should probably check as much
as possible there, but that change would be too large to fix a
regression.

iow the check was in the wrong place and was the wrong check, but we can
use your check and move it up ~15 lines and everything will be the same
and faster.
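
The ordering argument above can be sketched with a toy userspace model.
This is not kernel code: every toy_* name is hypothetical, and it assumes
the side effect that runs before the late check is the unmapping of the
destination range.  It only shows why checking the source VMA right after
the first lookup avoids the partial failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; not the kernel's vm_area_struct. */
struct toy_vma {
	unsigned long start, end;	/* covers [start, end) */
	bool sealed;			/* stands in for a sealed mapping */
	bool unmapped;			/* set once the toy "mremap" unmapped it */
};

/* Stand-in for a VMA lookup: return the VMA containing addr, or NULL. */
static struct toy_vma *toy_lookup(struct toy_vma *vmas, size_t n,
				  unsigned long addr)
{
	assert(vmas != NULL);
	for (size_t i = 0; i < n; i++)
		if (addr >= vmas[i].start && addr < vmas[i].end)
			return &vmas[i];
	return NULL;
}

/* Late check: the destination is unmapped before the source is examined. */
static int toy_mremap_late_check(struct toy_vma *vmas, size_t n,
				 unsigned long src, unsigned long dst)
{
	struct toy_vma *d = toy_lookup(vmas, n, dst);
	struct toy_vma *s;

	if (d)
		d->unmapped = true;	/* side effect happens first */
	s = toy_lookup(vmas, n, src);
	if (!s)
		return -EFAULT;
	if (s->sealed)
		return -EPERM;		/* fails, but dst is already gone */
	return 0;
}

/* Early check: the source is validated before anything is modified. */
static int toy_mremap_early_check(struct toy_vma *vmas, size_t n,
				  unsigned long src, unsigned long dst)
{
	struct toy_vma *s = toy_lookup(vmas, n, src);
	struct toy_vma *d;

	if (!s)
		return -EFAULT;
	if (s->sealed)
		return -EPERM;		/* fails with nothing touched */
	d = toy_lookup(vmas, n, dst);
	if (d)
		d->unmapped = true;
	return 0;
}
```

With a sealed source and an unsealed destination, both variants return
-EPERM, but only the late-check variant leaves the destination marked as
unmapped.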

For a later patch, there is an opportunity to make this even faster by
passing the vma through to vma_to_resize().  We could remove another
walk of the vma tree.  Probably not necessary to fix the regression, but
it would at least reduce the instruction count, and possibly improve
performance (depending on cache use).
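
As a rough illustration of that later cleanup, here is a toy userspace
sketch (again not kernel code; all sketch_* names are hypothetical) that
counts simulated tree walks.  It assumes the only difference between the
two shapes is whether the callee repeats the lookup or receives the
already-found vma.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; not the kernel's vm_area_struct. */
struct sketch_vma {
	unsigned long start, end;	/* covers [start, end) */
};

static unsigned int sketch_walks;	/* counts simulated tree walks */

/* Stand-in for a VMA lookup; each call counts as one walk. */
static struct sketch_vma *sketch_lookup(struct sketch_vma *vmas, size_t n,
					unsigned long addr)
{
	sketch_walks++;
	for (size_t i = 0; i < n; i++)
		if (addr >= vmas[i].start && addr < vmas[i].end)
			return &vmas[i];
	return NULL;
}

/* Shape described in the mail: the callee looks the VMA up again. */
static struct sketch_vma *sketch_resize_by_addr(struct sketch_vma *vmas,
						size_t n, unsigned long addr)
{
	return sketch_lookup(vmas, n, addr);
}

/* Suggested shape: the caller hands over the VMA it already found. */
static struct sketch_vma *sketch_resize_by_vma(struct sketch_vma *vma)
{
	assert(vma != NULL);
	return vma;
}

/* Returns the number of walks when the callee repeats the lookup. */
static unsigned int sketch_remap_by_addr(struct sketch_vma *vmas, size_t n,
					 unsigned long addr)
{
	sketch_walks = 0;
	if (sketch_lookup(vmas, n, addr))
		sketch_resize_by_addr(vmas, n, addr);
	return sketch_walks;
}

/* Returns the number of walks when the VMA is passed through. */
static unsigned int sketch_remap_by_vma(struct sketch_vma *vmas, size_t n,
					unsigned long addr)
{
	struct sketch_vma *vma;

	sketch_walks = 0;
	vma = sketch_lookup(vmas, n, addr);
	if (vma)
		sketch_resize_by_vma(vma);
	return sketch_walks;
}
```

For an address that falls inside a mapping, the first variant performs
two walks and the second performs one.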

Thanks,
Liam