Message-ID: <ac0ceb09ffaeb1f0925b61ed1b82ee6475df2368.camel@intel.com>
Date: Thu, 25 Sep 2025 23:58:01 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "broonie@...nel.org" <broonie@...nel.org>
CC: "adhemerval.zanella@...aro.org" <adhemerval.zanella@...aro.org>,
	"nsz@...t70.net" <nsz@...t70.net>, "brauner@...nel.org" <brauner@...nel.org>,
	"shuah@...nel.org" <shuah@...nel.org>, "debug@...osinc.com"
	<debug@...osinc.com>, "fweimer@...hat.com" <fweimer@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"catalin.marinas@....com" <catalin.marinas@....com>, "dalias@...c.org"
	<dalias@...c.org>, "jeffxu@...gle.com" <jeffxu@...gle.com>, "will@...nel.org"
	<will@...nel.org>, "yury.khrustalev@....com" <yury.khrustalev@....com>,
	"wilco.dijkstra@....com" <wilco.dijkstra@....com>,
	"linux-arm-kernel@...ts.infradead.org"
	<linux-arm-kernel@...ts.infradead.org>, "codonell@...hat.com"
	<codonell@...hat.com>, "libc-alpha@...rceware.org"
	<libc-alpha@...rceware.org>, "linux-kselftest@...r.kernel.org"
	<linux-kselftest@...r.kernel.org>
Subject: Re: [PATCH RFC 0/3] arm64/gcs: Allow reuse of user managed shadow
 stacks

On Fri, 2025-09-26 at 00:22 +0100, Mark Brown wrote:
> On Thu, Sep 25, 2025 at 08:40:56PM +0000, Edgecombe, Rick P wrote:
> 
> > Security-wise, it seems reasonable that if you are leaving a shadow stack,
> > you could leave a token behind. But for the userspace scheme to back up the SSP
> > by doing a longjmp() or similar I have some doubts. IIRC there were some cross
> > stack edge cases that we never figured out how to handle.
> 
> I think those were around the use of alt stacks, which we don't
> currently do for shadow stacks at all because we couldn't figure out
> those edge cases.  Possibly there's others as well, though - the alt
> stacks issues dominated discussion a bit.

For longjmp, IIRC there were some plans to search for a token on the target
stack and use it, which seems somewhat at odds with the quick, efficient jump
that longjmp() is usually used for. But it also doesn't solve the problem of
taking a signal while you are unwinding.

Like say you do calls all the way to the end of a shadow stack, and it's about
to overflow. Then the thread swaps to another shadow stack. If you longjmp back
to the original stack, you will have to transition to the end of the first stack
as you unwind. If at that point the thread gets a signal, it would overflow the
shadow stack. This is a subtle difference in behavior compared to running
without shadow stacks. You also need to know that nothing else could consume
that token on that stack in the meantime. So doing it safely is not nearly as
simple as normal longjmp().

Anyway, I don't think you need alt shadow stacks to hit that. Just normal
userspace threading?

> 
> AFAICT those issues exist anyway, if userspace is already unwinding as
> part of thread exit then they'll exercise that code though perhaps be
> saved from any issues by virtue of not actually doing any function
> calls.  Anything that actually does a longjmp() with the intent to
> continue will do so more thoroughly.
> 
> > As far as re-using allocated shadow stacks, there is always the option to enable
> > WRSS (or similar) to write the shadow stack as well as longjmp at will.
> 
> That's obviously a substantial downgrade in security though.

I don't know about substantial, but I'd love to hear some offensive security
person's analysis. There was definitely a school of thought, though, that
shadow stack should be turned on as widely as possible. If we need WRSS to make
that happen in a sane way, you could argue there is a sort of security-at-scale
benefit.

> 
> > I think we should see a fuller solution from the glibc side before adding new
> > kernel features like this. (apologies if I missed it). I wonder if we are
> 
> I agree that we want to see some userspace code here, I'm hoping this
> can be used for prototyping.  Yury has some code for the clone3() part
> of things in glibc on arm64 already, hopefully that can be extended to
> include the shadow stack in the thread stack cache.
> 
> > building something that will have an extremely complicated set of rules for what
> > types of stack operations should be expected to work.
> 
> I think restricted more than complex?
> 
> > Sort of related, I think we might think about msealing shadow stacks, which will
> > have trouble with a lot of these user managed shadow stack schemes. The reason
> > is that as long as shadow stacks can be unmapped while a thread is on them (say
> > a sleeping thread), a new shadow stack can be allocated in the same place with a
> > token. Then a second thread can consume the token and possibly corrupt the
> > shadow stack for the other thread with its own calls. I don't know how
> > realistic it is in practice, but it's something that guard gaps can't totally
> > prevent.
> 
> > But for automatic thread created shadow stacks, there is no need to allow
> > userspace to unmap a shadow stack, so the automatically created stacks could
> > simply be msealed on creation and unmapped from the kernel. For a lot of apps
> > (most?) this would work perfectly fine.
> 
> Indeed, we should be able to just do that if we're mseal()ing system
> mappings I think - most likely anything that has a problem with it
> probably already has a problem with the existing mseal() stuff.  Yet another
> reason we should be factoring more of this code out into the generic
> code, like I say I'll try to look at that.

Agreed. But for the mseal stuff, I think you would also want map_shadow_stack
to be unavailable.

> 
> I do wonder if anyone would bother with those attacks if they've got
> enough control over the process to do them, but equally a lot of this is
> about how things chain together.

Yeah, I don't know. But the guard gaps were added after a suggestion from Jann
Horn. This is a similar concern to sharing a shadow stack, but the difference
is stack overflow vs. controlling args to syscalls.

> 
> > I think we don't want 100 modes of shadow stack. If we have two, I'd think:
> > 1. Msealed, simple more locked down kernel allocated shadow stack. Limited or
> > none user space managed shadow stacks.
> > 2. WRSS enabled, clone3-preferred max compatibility shadow stack. Longjmp via
> > token writes and don't even have to think about taking signals while unwinding
> > across stacks, or whatever other edge case.
> 
> I think the important thing from a kernel ABI point of view is to give
> userspace the tools to do whatever it wants and get out of the way, and
> that ideally this should include options that don't just make the shadow
> stack writable since that's a substantial step down in protection.

Yes, I hear that. But we should also try to avoid creating maintenance issues
by adding features that don't turn out to be useful. It sounds like we agree
that we need more proof that this will work out in the long run.

> 
> That said your option 2 is already supported with the existing clone3()
> on both arm64 and x86_64, policy for switching between that and kernel
> managed stacks could be set by restricting the writable stacks flag on
> the enable prctl(), and/or restricting map_shadow_stack().

You mean userspace could already re-use shadow stacks if it enables writable
shadow stacks? Yes, I agree.

> 
> > This RFC seems to be going down the path of addressing one edge case at a time.
> > Alone it's fine, but I'd rather punt these types of usages to (2) by default. 
> 
> For me this is in the category of "oh, of course you should be able to
> do that" where it feels like an obvious usability thing than an edge
> case.

True. I guess I was thinking more about the stack unwinding. Badly phrased,
sorry.
