Message-ID: <ZJ2sTu9QRmiWNISy@arm.com>
Date: Thu, 29 Jun 2023 17:07:42 +0100
From: "szabolcs.nagy@....com" <szabolcs.nagy@....com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
"Lutomirski, Andy" <luto@...nel.org>
Cc: "Xu, Pengfei" <pengfei.xu@...el.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
"kcc@...gle.com" <kcc@...gle.com>,
"nadav.amit@...il.com" <nadav.amit@...il.com>,
"kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
"david@...hat.com" <david@...hat.com>,
"Schimpe, Christina" <christina.schimpe@...el.com>,
"Yang, Weijiang" <weijiang.yang@...el.com>,
"peterz@...radead.org" <peterz@...radead.org>,
"corbet@....net" <corbet@....net>, "nd@....com" <nd@....com>,
"broonie@...nel.org" <broonie@...nel.org>,
"dethoma@...rosoft.com" <dethoma@...rosoft.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>,
"debug@...osinc.com" <debug@...osinc.com>,
"bp@...en8.de" <bp@...en8.de>,
"rdunlap@...radead.org" <rdunlap@...radead.org>,
"linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
"rppt@...nel.org" <rppt@...nel.org>,
"jamorris@...ux.microsoft.com" <jamorris@...ux.microsoft.com>,
"pavel@....cz" <pavel@....cz>,
"john.allen@....com" <john.allen@....com>,
"bsingharora@...il.com" <bsingharora@...il.com>,
"mike.kravetz@...cle.com" <mike.kravetz@...cle.com>,
"jannh@...gle.com" <jannh@...gle.com>,
"andrew.cooper3@...rix.com" <andrew.cooper3@...rix.com>,
"oleg@...hat.com" <oleg@...hat.com>,
"keescook@...omium.org" <keescook@...omium.org>,
"gorcunov@...il.com" <gorcunov@...il.com>,
"arnd@...db.de" <arnd@...db.de>,
"Yu, Yu-cheng" <yu-cheng.yu@...el.com>,
"fweimer@...hat.com" <fweimer@...hat.com>,
"hpa@...or.com" <hpa@...or.com>,
"mingo@...hat.com" <mingo@...hat.com>,
"hjl.tools@...il.com" <hjl.tools@...il.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"Syromiatnikov, Eugene" <esyr@...hat.com>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"Torvalds, Linus" <torvalds@...ux-foundation.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"Eranian, Stephane" <eranian@...gle.com>
Subject: Re: [PATCH v9 23/42] Documentation/x86: Add CET shadow stack
description
The 06/22/2023 23:18, Edgecombe, Rick P wrote:
> I'd also appreciate if you could spell out exactly which:
> - ucontext
> - signal
> - longjmp
> - custom library stack switching
>
> patterns you think shadow stack should support working together.
> Because even after all these mails, I'm still not sure exactly what you
> are trying to achieve.
i'm trying to support two operations (in any combination):
(1) jump up the current (active) stack.
(2) jump to a live frame in a different inactive but live stack.
the old stack becomes inactive (= no task executes on it)
and live (= has valid frames to jump to).
with
(3) the runtime must manage the shadow stacks transparently.
(= portable c code does not need modifications)
mapping this to c apis:
- swapcontext, setcontext, longjmp, custom stack switching are jump
operations. (there are conditions under which (1) and (2) must work,
further details don't matter.)
- makecontext creates an inactive live stack.
- signal is only special if it executes on an alt stack: on signal
entry the alt stack becomes active and the interrupted stack
inactive but live. (nested signals execute on the alt stack until
that is left either via a jump or signal return.)
- unwinding can be implemented with jump operations (it needs some
other things but that's out of scope here).
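a minimal sketch of the two jump operations using only standard ucontext and setjmp calls. the names run_jump_demo, co_body, deep and trace are made up for illustration, and nothing here touches shadow stacks directly: under model point (3) the runtime would have to keep the shadow stack in sync behind these calls.

```c
#include <setjmp.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static int trace[4];
static int ntrace;

static void co_body(void)
{
    trace[ntrace++] = 1;              /* now running on the second stack */
    swapcontext(&co_ctx, &main_ctx);  /* (2): the first stack is inactive but live */
    trace[ntrace++] = 3;
    /* returning resumes uc_link, i.e. main_ctx */
}

static void deep(jmp_buf env)
{
    longjmp(env, 1);                  /* (1): jump up the active stack */
}

/* returns the observed order as digits, or -1 if allocation failed */
int run_jump_demo(void)
{
    enum { STACK_SIZE = 64 * 1024 };
    jmp_buf env;
    char *stack = malloc(STACK_SIZE);

    if (!stack)
        return -1;
    ntrace = 0;

    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = STACK_SIZE;
    co_ctx.uc_link = &main_ctx;
    makecontext(&co_ctx, co_body, 0); /* creates an inactive live stack */

    swapcontext(&main_ctx, &co_ctx);  /* (2): first entry into co_body */
    trace[ntrace++] = 2;
    swapcontext(&main_ctx, &co_ctx);  /* (2): resume the live frame in co_body */

    if (setjmp(env) == 0)
        deep(env);
    else
        trace[ntrace++] = 4;

    free(stack);
    return trace[0] * 1000 + trace[1] * 100 + trace[2] * 10 + trace[3];
}
```

calling run_jump_demo() from main should yield 1234 when every jump lands where the model says it should.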
the patterns that shadow stack should support fall out of this model.
(e.g. posix does not allow jumping from one thread to the stack of a
different thread, but the model does not care about that, it only
cares if the target stack is inactive and live then jump should work.)
some observations:
- it is necessary for jump to detect case (2) and then switch to the
target shadow stack. this is also sufficient to implement it. (note:
the restore token can be used for detection since that is guaranteed
to be present when user code creates an inactive live stack and is
not present anywhere else by design. a different marking can be used
if the inactive live stack is created by the kernel, but then the
kernel has to provide a switch method, e.g. syscall. this should not
be controversial.)
- in this model two live stacks cannot use the same shadow stack since
jumping between the two stacks is allowed in both directions, but
jumping within a shadow stack only works in one direction. (also two
tasks could execute on the same shadow stack then. and it makes
shadow stack size accounting problematic.)
- so sharing shadow stack with alt stack is broken. (the model is
right in the sense that valid posix code can trigger the issue. we
can ignore that corner case and adjust the model so the shared
shadow stack works for alt stack, but it likely does not change the
jump design: eventually we want alt shadow stack.)
- shadow stack cannot always be managed by the runtime transparently:
it has to be allocated for makecontext and alt stack in situations
where allocation failure cannot be handled. more alarmingly the
destruction of stacks may not be visible to the runtime so the
corresponding shadow stacks leak. my preferred way to fix this is
new apis that are shadow stack compatible (e.g. shadow_makecontext
with shadow_freecontext) and marking the incompatible apis as such.
portable code then can decide to update to new apis, run with shstk
disabled or accept the leaks and OOM failures. the current approach
needs ifdef __CET__ in user code for makecontext and sigaltstack
has many issues.
- i'm still not happy with the shadow stack sizing. and would like to
have a token at the end of the shadow stack to allow scanning. and
it would be nice to deal with shadow stack overflow. and there is
async disable on dlopen. so there are things to work on.
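the first two observations can be illustrated with a toy model. this is not the x86 instruction sequence or any kernel or libc interface: struct shstk, TOKEN, toy_jump and toy_demo are hypothetical names, and a shadow stack is just an array here. the only things it tries to show are that a token on top marks an inactive live stack, that jump can use it to detect case (2), and that within one shadow stack jumping only works toward older frames.

```c
#include <stdbool.h>
#include <stddef.h>

enum { SLOTS = 16, TOKEN = -1 };      /* TOKEN: toy stand-in for a restore token */

struct shstk {
    long slot[SLOTS];                 /* toy return addresses, or TOKEN */
    size_t top;                       /* number of used slots */
};

static struct shstk *active;          /* the shadow stack the task runs on */

/* jump so that only the oldest `depth` frames of `target` remain */
bool toy_jump(struct shstk *target, size_t depth)
{
    if (target != active) {
        /* case (2): target must be inactive and live, i.e. token on top */
        if (target->top == 0 || target->slot[target->top - 1] != TOKEN)
            return false;
        if (depth > target->top - 1 || active->top == SLOTS)
            return false;
        active->slot[active->top++] = TOKEN; /* old stack: inactive but live */
        target->top--;                       /* consume the target's token */
        active = target;
    } else if (depth > active->top) {
        return false;                 /* case (1) cannot reach discarded frames */
    }
    active->top = depth;              /* unwind to the target frame */
    return true;
}

/* returns the number of steps that behaved as the model predicts */
int toy_demo(void)
{
    struct shstk a = { {100, 101, 102}, 3 };
    struct shstk b = { {200, TOKEN}, 2 };    /* as if set up for makecontext */
    int ok = 0;

    active = &a;
    ok += toy_jump(&a, 1);            /* (1): jump up the active stack */
    ok += toy_jump(&b, 1);            /* (2): switch to the inactive live stack */
    ok += toy_jump(&a, 1);            /* (2): and back, a carries a token now */
    ok += !toy_jump(&a, 2);           /* rejected: that frame is gone */
    active = NULL;
    return ok;
}
```

in this model two live stacks sharing one struct shstk would need jumps in both directions inside a single array, which the last step shows cannot work.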
i understand that the proposed linux abi makes most existing binaries
with shstk marking work, which is relevant for x86.
for a while i thought we can fix the remaining issues even if that
means breaking existing shstk binaries (just bump the abi marking).
now it seems the issues can only be addressed in a future abi break.
which means x86 linux will likely end up maintaining two incompatible
abis and the future one will need user code and build system changes,
not just runtime changes. it is not a small incremental change to add
alt shadow stack support for example.
i don't think the maintenance burden of two shadow stack abis is the
right path for arm64 to follow, so the shadow stack semantics will
likely become divergent, not common, across targets.
i hope my position is now clearer.