Message-ID: <87o6rszrnp.ffs@tglx>
Date: Tue, 02 Sep 2025 19:32:42 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, LKML
 <linux-kernel@...r.kernel.org>
Cc: Jens Axboe <axboe@...nel.dk>, Peter Zijlstra <peterz@...radead.org>,
 "Paul E. McKenney" <paulmck@...nel.org>, Boqun Feng
 <boqun.feng@...il.com>, Paolo Bonzini <pbonzini@...hat.com>, Sean
 Christopherson <seanjc@...gle.com>, Wei Liu <wei.liu@...nel.org>, Dexuan
 Cui <decui@...rosoft.com>, x86@...nel.org, Arnd Bergmann <arnd@...db.de>,
 Heiko Carstens <hca@...ux.ibm.com>, Christian Borntraeger
 <borntraeger@...ux.ibm.com>, Sven Schnelle <svens@...ux.ibm.com>, Huacai
 Chen <chenhuacai@...nel.org>, Paul Walmsley <paul.walmsley@...ive.com>,
 Palmer Dabbelt <palmer@...belt.com>
Subject: Re: [patch V2 25/37] rseq: Rework the TIF_NOTIFY handler

On Tue, Aug 26 2025 at 11:12, Mathieu Desnoyers wrote:
> On 2025-08-23 12:40, Thomas Gleixner wrote:
>> +void __rseq_handle_notify_resume(struct pt_regs *regs)
>> +{
>> +	/*
>> +	 * If invoked from hypervisors before entering the guest via
>> +	 * resume_user_mode_work(), then @regs is a NULL pointer.
>> +	 *
>> +	 * resume_user_mode_work() clears TIF_NOTIFY_RESUME and re-raises
>> +	 * it before returning from the ioctl() to user space when
>> +	 * rseq_event.sched_switch is set.
>> +	 *
>> +	 * So it's safe to ignore here instead of pointlessly updating it
>> +	 * in the vcpu_run() loop.
>
> I don't think any virt user should expect the userspace fields to be
> updated on the host process while running in guest mode, but it's good
> to clarify that we intend to change this user-visible behavior within
> this series, to spare any unwelcome surprise.

Actually it is not really a user-visible change.

TLS::rseq is thread local, and any update to it only becomes visible to
user space once the vCPU thread actually returns to user space. Arguably
no guest has legitimate access to the host vCPU thread's TLS.

You might argue that GDB might look at the thread's TLS::rseq while the
task runs in vCPU guest mode. But that's completely irrelevant because
once a task enters the kernel the RSEQ CPU/NODE/MM IDs have no meaning
anymore. They are only valid as long as the task runs in user space.
When a task hits a breakpoint, GDB can only look at the state _before_
that, and that's all it can see when it looks at the TLS of a thread
which voluntarily went into the kernel via the KVM ioctl.

That update is truly a kernel internal implementation detail and it got
introduced way _after_ the initial RSEQ implementation.

Before 5.9 KVM ignored most of the pending TIF work, including
TIF_NOTIFY_RESUME. Once that got fixed, it turned out that handling the
other TIF_NOTIFY_RESUME work could result in losing an RSEQ update. To
cure that, the rseq handler got pulled into that TIF_NOTIFY_RESUME
demultiplexing function and gained the NULL pointer check to exclude
the critical section check.
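
For illustration only, here is a minimal user-space model of the flow
described in the quoted comment: a NULL @regs means the call came from
the vcpu_run() loop, the rseq update is skipped, and the pending event
causes the notification to be re-raised before the ioctl() returns. All
names (task_model, pt_regs_model, the model_*() functions) are
hypothetical stand-ins, not the kernel's definitions, and where exactly
the re-raise happens is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs; only NULL vs. non-NULL matters. */
struct pt_regs_model {
	unsigned long ip;
};

/* Hypothetical per-task state, loosely modelled on the description above. */
struct task_model {
	bool notify_resume;	/* models TIF_NOTIFY_RESUME */
	bool sched_switch;	/* models rseq_event.sched_switch */
	unsigned int id_updates;	/* counts updates of the user space IDs */
	unsigned int cs_checks;	/* counts critical section checks */
};

/* Models the rseq handler: a NULL @regs means "called before guest entry". */
void model_rseq_handle_notify_resume(struct task_model *t,
				     const struct pt_regs_model *regs)
{
	assert(t != NULL);

	/* Hypervisor path: skip the update and leave the event pending. */
	if (!regs)
		return;

	/* Return to user space: check the critical section, update the IDs. */
	t->cs_checks++;
	t->id_updates++;
	t->sched_switch = false;
}

/* Models the demultiplexer: clears the TIF bit, then runs the rseq handler. */
void model_resume_user_mode_work(struct task_model *t,
				 const struct pt_regs_model *regs)
{
	assert(t != NULL);

	t->notify_resume = false;
	model_rseq_handle_notify_resume(t, regs);
}

/* Models the ioctl() exit: re-raise the bit while the event is pending. */
void model_return_from_ioctl(struct task_model *t)
{
	assert(t != NULL);

	if (t->sched_switch)
		t->notify_resume = true;
}
```

Calling model_resume_user_mode_work() with NULL leaves id_updates at
zero and sched_switch set, model_return_from_ioctl() then re-raises
notify_resume, and the next call with a valid regs pointer performs the
single update.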

In hindsight RSEQ should have used a separate TIF bit right from the
beginning, but that's water under the bridge...

Thanks,

        tglx
