linux-kernel - Re: [PATCH v4 1/1] exec: seal system mappings

Message-ID: <2e5de601da34342d8eb0d8319dcf81ff213c7ef0.camel@sipsolutions.net>
Date: Thu, 16 Jan 2025 18:01:47 +0100
From: Benjamin Berg <benjamin@...solutions.net>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Jeff Xu
 <jeffxu@...omium.org>
Cc: Kees Cook <kees@...nel.org>, akpm@...ux-foundation.org,
 jannh@...gle.com, torvalds@...ux-foundation.org,
 adhemerval.zanella@...aro.org, oleg@...hat.com,
 linux-kernel@...r.kernel.org, linux-hardening@...r.kernel.org,
 linux-mm@...ck.org, jorgelo@...omium.org, sroettger@...gle.com,
 ojeda@...nel.org, adobriyan@...il.com, anna-maria@...utronix.de,
 mark.rutland@....com, linus.walleij@...aro.org, Jason@...c4.com,
 deller@....de, rdunlap@...radead.org, davem@...emloft.net, hch@....de,
 peterx@...hat.com, hca@...ux.ibm.com, f.fainelli@...il.com,
 gerg@...nel.org, dave.hansen@...ux.intel.com, mingo@...nel.org,
 ardb@...nel.org, Liam.Howlett@...cle.com, mhocko@...e.com,
 42.hyeyoo@...il.com, peterz@...radead.org, ardb@...gle.com, enh@...gle.com,
 rientjes@...gle.com, groeck@...omium.org, mpe@...erman.id.au,
 Vlastimil Babka <vbabka@...e.cz>, Andrei Vagin <avagin@...il.com>,
 Dmitry Safonov <0x7f454c46@...il.com>,
 Mike Rapoport <mike.rapoport@...il.com>,
 Alexander Mikhalitsyn <aleksandr.mikhalitsyn@...onical.com>
Subject: Re: [PATCH v4 1/1] exec: seal system mappings

Hi Lorenzo,

On Thu, 2025-01-16 at 15:48 +0000, Lorenzo Stoakes wrote:
> On Wed, Jan 15, 2025 at 12:20:59PM -0800, Jeff Xu wrote:
> > On Wed, Jan 15, 2025 at 11:46 AM Lorenzo Stoakes
> > <lorenzo.stoakes@...cle.com> wrote:
> 
> [SNIP]
> > 
> > > I've made it abundantly clear that this (NACKed) series cannot allow the
> > > kernel to be in a broken state even if a user sets flags to do so.
> > > 
> > > This is because users might lack context to make this decision and
> > > incorrectly do so, and now we ship a known-broken kernel.
> > > 
> > > You are now suggesting disabling the !CRIU requirement. Which violates my
> > > _requirements_ (not optional features).
> > > 
> > Sure, I can add CRIU back.
> > 
> > Are you fine with UML and gViso not working under this CONFIG?
> > UML/gViso don't use any KCONFIG like CRIU does.
> 
> Yeah this is a concern, wouldn't we be able to catch UML with a flag?
> 
> Apologies my fault for maybe not being totally up to date with this, but what
> exactly was the gViso (is it gVisor actually?)

UML is a separate architecture. It is a Linux kernel running as a
userspace application on top of an unmodified host kernel.

So really, for the purpose of this discussion, UML is mostly just a
weird userspace program. And a pretty buggy one too: it got broken by
rseq already.

What UML now does is:
 * Execute a tiny static binary
 * Map special "stub" code/data pages at the topmost userspace address
   (replacing its stack)
 * Continue execution inside the "stub" pages
 * Unmap everything below the "stub" pages
 * Use the unmapped area for userspace application mappings

I believe that the "unmap everything" step will fail with this feature.
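
To illustrate why that step is expected to fail, here is a toy Python
model (not kernel code; the addresses, the mapping names and the
`unmap_range` and `layout` helpers are all made up for this sketch). It
assumes that the host seals the vDSO/VVAR mappings and that unmapping a
range which overlaps a sealed mapping is refused with EPERM, which is
how mseal behaves. Leaving everything untouched on failure is a
simplification of the model:

```python
import errno

def unmap_range(mappings, start, end):
    """Remove [start, end) from a toy address space.

    Each mapping is (start, end, name, sealed). Returns (err, mappings):
    err is errno.EPERM if a sealed mapping overlaps the range (nothing is
    removed in that case), otherwise 0 with the range cut out.
    """
    for m_start, m_end, _name, sealed in mappings:
        if sealed and m_start < end and start < m_end:
            return errno.EPERM, list(mappings)

    kept = []
    for m_start, m_end, name, sealed in mappings:
        if m_end <= start or m_start >= end:
            kept.append((m_start, m_end, name, sealed))  # no overlap
            continue
        if m_start < start:
            kept.append((m_start, start, name, sealed))  # part below range
        if m_end > end:
            kept.append((end, m_end, name, sealed))      # part above range
    return 0, kept

# Hypothetical layout: stub pages at the top, vDSO/VVAR somewhere below.
STUB_START = 0x7FFFFFFFE000

def layout(vdso_sealed):
    return [
        (0x400000, 0x401000, "uml-static-binary", False),
        (0x7FFFF7FC0000, 0x7FFFF7FC4000, "[vvar]", vdso_sealed),
        (0x7FFFF7FC4000, 0x7FFFF7FC6000, "[vdso]", vdso_sealed),
        (STUB_START, 0x800000000000, "stub", False),
    ]

for sealed in (True, False):
    err, after = unmap_range(layout(sealed), 0, STUB_START)
    print(sealed, errno.errorcode.get(err, "OK"), [m[2] for m in after])
```

With the sealed layout the model reports EPERM and every mapping is
still present; with the unsealed layout only the stub remains, which is
what UML relies on today.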


Now, I am sure one can come up with solutions, e.g.:
   1. Simply print an explanation if the munmap() fails
   2. Find an address that is guaranteed to be below the VDSO and use a
      smaller address space for the UML userspace.
   3. Somehow tell the host kernel to not install the VDSO mappings
   4. Add the host VDSO pages as a sealed VMA within UML to guard them
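
As a rough sketch of what option 2 might start from, the Python snippet
below scans text in the /proc/<pid>/maps format and reports the lowest
start address among the special mappings. The function name, the sample
addresses and the default list of mapping names are assumptions for
illustration only. It only shows where the mappings happen to be in one
process; whether a bound can actually be guaranteed is a separate
question:

```python
def lowest_special_start(maps_text, names=("[vdso]", "[vvar]")):
    """Return the lowest start address of the named mappings, or None.

    maps_text uses the /proc/<pid>/maps line format:
    start-end perms offset dev inode [name]
    """
    lowest = None
    for line in maps_text.splitlines():
        fields = line.split()
        # Anonymous mappings have no sixth field; skip them.
        if len(fields) < 6 or fields[5] not in names:
            continue
        start = int(fields[0].split("-")[0], 16)
        if lowest is None or start < lowest:
            lowest = start
    return lowest

# Made-up sample; a real caller would read /proc/self/maps instead.
SAMPLE_MAPS = """\
00400000-00452000 r-xp 00000000 08:02 173521      /usr/bin/example
7ffd1a500000-7ffd1a521000 rw-p 00000000 00:00 0   [stack]
7ffd1a5f3000-7ffd1a5f7000 r--p 00000000 00:00 0   [vvar]
7ffd1a5f7000-7ffd1a5f9000 r-xp 00000000 00:00 0   [vdso]
"""

limit = lowest_special_start(SAMPLE_MAPS)
print(hex(limit) if limit is not None else "no special mappings found")
```

Everything below the returned address could then be treated as the
(smaller) address space available to UML userspace.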

UML is a bit of a niche and I am not sure it is worth worrying about it
too much.

Benjamin

> 
> > 
> > > You seem to be saying you're pushing an internal feature on upstream and
> > > only care about internal use cases, this is not how upstream works, as
> > > Matthew alludes to.
> > > 
> > > I have told you that my requirements are:
> > > 
> > > 1. You cannot allow a user to set config or boot options to have a
> > >    broken kernel configuration.
> > > 
> > Can you clarify the definition of "broken kernel configuration"?
> 
> Anything that'd break userland in a way that would be entirely
> unexpected.
> 
> Especially so if there is a real disconnect between the person who is
> enabling the feature and the program.
> 
> For instance if a distro wants to be big on security, is (as is entirely
> reasonable) concerned about an unsealed VDSO/VVAR/etc. being exploited, so
> turns on the flag, but _doesn't realise_ or doesn't communicate (such a big
> problem and difficult actually for many distros/vendors) that this will
> break certain programs - and then users do a kernel update, and *bang*
> their whole system is broken.
> 
> It's really this kind of scenario I'm worried about.
> 
> This is the crux of it really.
> 
> > 
> > Do you consider "setting mseal kernel cmd line under a 32-bit build" as broken?
> > If so, this problem is not solvable and I might just not try to solve
> > it for the next version.
> 
> Yeah, I really don't like the kernel cmd line thing, because of this risk
> of disconnect - your justification for it is prima facie reasonable - the
> distro didn't want to enable the thing by default but you want more
> security - but then we have this issue with the possible disconnect between
> 'hey here is security feature X' vs. 'security feature X breaks Y, Z +
> alpha'.
> 
> > 
> > If you just refer to a need to detect CRIU, in KCONFIG and/or kernel
> > cmd line, this is solvable.
> > 
> > > 2. You must provide evidence that the arches you claim work with this,
> > >    actually do.
> > > 
> > Sure
> 
> See my reply to Kees as to what this comprises, sorry if I was not clear
> previously.
> 
> 
> > 
> > > You seem to have eliminated that from your summary as if the very thing
> > > that makes this series NACKed were not pertinent.
> > > 
> > In my last email, I tried to cover all code-logic related comments,
> > which are blocking me.
> > I also mentioned I will address non-code related comments
> > (threat model, tests, etc.) later.
> 
> Ack.
> 
> I felt that you hadn't hit on my fundamental objections and this was in
> effect - a final analysis as to how you would be moving forward with v5 -
> but apologies if you did intend to separately discuss them.
> 
> > 
> > > If you do not address these correctly, I will simply have to reject your v5
> > > too and it'll waste everybody's time. I _genuinely_ don't want to have to
> > > do this.
> > > 
> > > Any solution MUST fulfil these requirements. I also want to see v5 as an
> > > RFC honestly at this stage, since it seems we are VERY MUCH in a discussion
> > > phase rather than a patch phase at this time.
> > > 
> > Sure.
> 
> To be clear - if the series is viable, I want to see it merged. And to
> further clarify - a simpler, smaller version of this that explicitly
> disallows breakage in config options suffices (though we must clarify the
> gVisor + UML things).
> 
> If I just wanted to reject this outright, I'd tell you :) (I don't).
> 
> I just need to feel vaguely less anxious about breaking things! :)
> 
> > 
> > > I really want to help you improve mseal and get things upstream, but I
> > > can't ignore my duty to ensure that the kernel remains stable and we don't
> > > hand kernel users (overly huge) footguns. I hate to be negative, but this
> > > is why I am pushing back so much here.
> > > 
> > Thanks. You can help me by answering my questions, and clarify your
> > requirements. I appreciate your time to make this feature useful.
> 
> Sure, hopefully I have done so, do follow up if anything was unclear.
> 
> > 
> > Please take note that security features often take away
> > capabilities. Sometimes it is impossible to meet security, usability
> > and performance goals simultaneously. I'm trying my best to get all
> > aspects satisfied.
> 
> Ack, and I realise it's often a difficult trade-off. I just worry about
> compounding complexity in consequences of kernel configuration vs. userland
> stuff + the disconnect between the two.
> 
> > 
> > -Jeff
> > 
> > > Thanks!
> 
> Cheers, Lorenzo
>