linux-kernel - Re: [PATCH 3/4] rseq: Make rseq work with protection keys

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CACT4Y+Z3PCBWDdR5PifXoMBSYJ-cdUmzLRdgbjTUG+A2S8Qq1g@mail.gmail.com>
Date: Tue, 18 Feb 2025 16:10:51 +0100
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: peterz@...radead.org, boqun.feng@...il.com, tglx@...utronix.de, 
	mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com, hpa@...or.com, 
	aruna.ramakrishna@...cle.com, elver@...gle.com, 
	"Paul E. McKenney" <paulmck@...nel.org>, x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/4] rseq: Make rseq work with protection keys

On Tue, 18 Feb 2025 at 15:57, Mathieu Desnoyers
<mathieu.desnoyers@...icios.com> wrote:
>
> On 2025-02-18 02:55, Dmitry Vyukov wrote:
> > On Mon, 17 Feb 2025 at 21:21, Mathieu Desnoyers
> [...]
> >>
> >>    we still let this function read the rseq_cs.
> >>> +      * It's unclear what benefits the resticted code gets by doing this
> >>
> >> restricted
> >>
> >>> +      * (it probably already hijacked control flow at this point), and
> >>> +      * presumably any sane sandbox should prohibit restricted code
> >>> +      * from accessing struct rseq, and this is still better than
> >>> +      * terminating the app unconditionally (it always has a choice
> >>> +      * of not using rseq and pkeys together).
> >>
> >> Note that because userspace can complete an rseq critical section
> >> without clearing the rseq_cs pointer, this could happen simply because
> >> the kernel is preempting the task after it has:
> >>
> >> 1) completed an rseq critical section, without clearing rseq_cs,
> >> 2) changed pkey.
> >>
> >> So allowing this is important, and I would remove the comment about
> >> hijacked control flow and such. This can happen with normal use of the
> >> ABI.
> >
> > Thanks for the review!
> >
> > I've addressed all comments in the series in v2.
> >
> > I've reworded this paragraph to simplify sentences, but I still kept
> > the note aboud malicious rseq_cs.
> >
> > If we would not be circumventing normal protection, then, yes, these
> > cases would be the same. But since we are circumventing protection
> > that otherwise exists, I think it's important to think about
> > potentially malicious cases. In this context inaccessible rseq_cs
> > values that resulted from normal execution are very different from
> > malicious onces. Normal ones will point to a fixed set of real
> > well-formed rseq_cs objects, while malicious ones may point to
> > god-knows-where in an attempt of an attacker to do things we can't
> > even imagine right now (e.g. rseq_cs overlapping with protected crypto
> > keys).
> >
> > It's as if a particular instance of copy_to_user would allow
> > user-space to write arbitrary kernel memory, and memory of other
> > processes circumventing all normal protections. In that context we
> > would need to be very careful regarding what we actually allow.
>
> I'm considering that we should clear the rseq_cs pointer whenever
> userspace issues pkey_mprotect.
>
> This would ensure that no legitimate scenario can trigger a load
> from a rseq_cs area which has the wrong pkey, and therefore we
> could accept read/write from/to a struct rseq which has the wrong
> pkey, but kill the process if trying to read/write from a
> struct rseq_cs with the wrong key. This would prevent userspace
> from making the kernel read/write from/to memory with the wrong
> pkey through a pointer it controls (rseq_cs pointer).
>
> Thoughts ?

I am not following.

There are pkey_mprotect calls, then independently installs on rseq_cs
pointers that happen concurrently and after pkey_mprotect, and
independent set of pkey_set calls that happens concurrently and after
the previous 2.
I don't see how doing something at the pkey_mprotect call for the
single thread avoids any scenarios.
Moreover, pkey 0 is preinstalled for all pages, but access to it can
be revoked in future.