linux-kernel - Re: [PATCH 3/4] rseq: Make rseq work with protection keys

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <2bcfeb11-74a1-497b-b271-06dfb51eace3@efficios.com>
Date: Tue, 18 Feb 2025 10:27:24 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Dmitry Vyukov <dvyukov@...gle.com>
Cc: peterz@...radead.org, boqun.feng@...il.com, tglx@...utronix.de,
 mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com, hpa@...or.com,
 aruna.ramakrishna@...cle.com, elver@...gle.com,
 "Paul E. McKenney" <paulmck@...nel.org>, x86@...nel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/4] rseq: Make rseq work with protection keys

On 2025-02-18 10:10, Dmitry Vyukov wrote:
> On Tue, 18 Feb 2025 at 15:57, Mathieu Desnoyers
> <mathieu.desnoyers@...icios.com> wrote:
>>
>> On 2025-02-18 02:55, Dmitry Vyukov wrote:
>>> On Mon, 17 Feb 2025 at 21:21, Mathieu Desnoyers
>> [...]
>>>>
>>>>     we still let this function read the rseq_cs.
>>>>> +      * It's unclear what benefits the resticted code gets by doing this
>>>>
>>>> restricted
>>>>
>>>>> +      * (it probably already hijacked control flow at this point), and
>>>>> +      * presumably any sane sandbox should prohibit restricted code
>>>>> +      * from accessing struct rseq, and this is still better than
>>>>> +      * terminating the app unconditionally (it always has a choice
>>>>> +      * of not using rseq and pkeys together).
>>>>
>>>> Note that because userspace can complete an rseq critical section
>>>> without clearing the rseq_cs pointer, this could happen simply because
>>>> the kernel is preempting the task after it has:
>>>>
>>>> 1) completed an rseq critical section, without clearing rseq_cs,
>>>> 2) changed pkey.
>>>>
>>>> So allowing this is important, and I would remove the comment about
>>>> hijacked control flow and such. This can happen with normal use of the
>>>> ABI.
>>>
>>> Thanks for the review!
>>>
>>> I've addressed all comments in the series in v2.
>>>
>>> I've reworded this paragraph to simplify sentences, but I still kept
>>> the note aboud malicious rseq_cs.
>>>
>>> If we would not be circumventing normal protection, then, yes, these
>>> cases would be the same. But since we are circumventing protection
>>> that otherwise exists, I think it's important to think about
>>> potentially malicious cases. In this context inaccessible rseq_cs
>>> values that resulted from normal execution are very different from
>>> malicious onces. Normal ones will point to a fixed set of real
>>> well-formed rseq_cs objects, while malicious ones may point to
>>> god-knows-where in an attempt of an attacker to do things we can't
>>> even imagine right now (e.g. rseq_cs overlapping with protected crypto
>>> keys).
>>>
>>> It's as if a particular instance of copy_to_user would allow
>>> user-space to write arbitrary kernel memory, and memory of other
>>> processes circumventing all normal protections. In that context we
>>> would need to be very careful regarding what we actually allow.
>>
>> I'm considering that we should clear the rseq_cs pointer whenever
>> userspace issues pkey_mprotect.
>>
>> This would ensure that no legitimate scenario can trigger a load
>> from a rseq_cs area which has the wrong pkey, and therefore we
>> could accept read/write from/to a struct rseq which has the wrong
>> pkey, but kill the process if trying to read/write from a
>> struct rseq_cs with the wrong key. This would prevent userspace
>> from making the kernel read/write from/to memory with the wrong
>> pkey through a pointer it controls (rseq_cs pointer).
>>
>> Thoughts ?
> 
> I am not following.
> 
> There are pkey_mprotect calls, then independently installs on rseq_cs
> pointers that happen concurrently and after pkey_mprotect, and
> independent set of pkey_set calls that happens concurrently and after
> the previous 2.
> I don't see how doing something at the pkey_mprotect call for the
> single thread avoids any scenarios.

Hrm. Sorry, I mixed up pkey_set() vs pkey_mprotect(). What I had in mind
was actually pkey_set(). And that would need to clear rseq_cs for all
threads belonging to the process, which may not be straightforward
because those could legitimately be inside a rseq critical section.

OK, let's try another approach: rather than kill the process if
read/write of the rseq_cs area with the wrong key fails, could we simply
clear the rseq_cs pointer in that case ? Technically there would be no
legitimate use of this except for the case where it is meant to be lazily
cleared.

Thanks,

Mathieu

> Moreover, pkey 0 is preinstalled for all pages, but access to it can
> be revoked in future.


-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com