Date: Tue, 7 May 2024 10:55:54 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Leonardo Bras <leobras@...hat.com>, Paolo Bonzini <pbonzini@...hat.com>, 
	Frederic Weisbecker <frederic@...nel.org>, Neeraj Upadhyay <quic_neeraju@...cinc.com>, 
	Joel Fernandes <joel@...lfernandes.org>, Josh Triplett <josh@...htriplett.org>, 
	Boqun Feng <boqun.feng@...il.com>, Steven Rostedt <rostedt@...dmis.org>, 
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Lai Jiangshan <jiangshanlai@...il.com>, 
	Zqiang <qiang.zhang1211@...il.com>, Marcelo Tosatti <mtosatti@...hat.com>, kvm@...r.kernel.org, 
	linux-kernel@...r.kernel.org, rcu@...r.kernel.org
Subject: Re: [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu

On Fri, May 03, 2024, Paul E. McKenney wrote:
> On Fri, May 03, 2024 at 02:29:57PM -0700, Sean Christopherson wrote:
> > So if we're comfortable relying on the 1 second timeout to guard against a
> > misbehaving userspace, IMO we might as well fully rely on that guardrail.  I.e.
> > add a generic PF_xxx flag (or whatever flag location is most appropriate) to let
> > userspace communicate to the kernel that it's a real-time task that spends the
> > overwhelming majority of its time in userspace or guest context, i.e. should be
> > given extra leniency with respect to rcuc if the task happens to be interrupted
> > while it's in kernel context.
> 
> But if the task is executing in host kernel context for quite some time,
> then the host kernel's RCU really does need to take evasive action.

Agreed, but what I'm saying is that RCU already has the mechanism to do so in the
form of the 1 second timeout.
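
To make that concrete, here is a rough userspace sketch of the policy I have in
mind.  It is a toy model, not kernel code: struct task_model, the
mostly_user_or_guest flag, LENIENCY_MS and should_kick_rcu_core() are all
hypothetical names made up for illustration, and the assumption that the
leniency window equals the 1 second timeout is mine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: leniency window, assumed equal to the 1 second timeout. */
#define LENIENCY_MS 1000u

/* Hypothetical stand-in for a per-task flag that userspace would set. */
struct task_model {
	bool mostly_user_or_guest;
};

/*
 * Decide whether the interrupted CPU should run RCU core processing now.
 * Flagged tasks get leniency until the grace period has been pending for
 * LENIENCY_MS; after that the task is treated like any other.
 */
static bool should_kick_rcu_core(const struct task_model *t,
				 bool rcu_work_pending, uint64_t gp_age_ms)
{
	if (!rcu_work_pending)
		return false;

	/* Flagged task and a young grace period: skip the kick. */
	if (t->mostly_user_or_guest && gp_age_ms < LENIENCY_MS)
		return false;

	return true;
}
```

The point of the sketch is that a misbehaving task that sets the flag and then
sits in the kernel only delays things until the window expires.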

And while KVM does not guarantee that it will immediately resume the guest after
servicing the IRQ, neither does the existing userspace logic.  E.g. I don't see
anything that would prevent the kernel from preempting the interrupted task.

> On the other hand, if that task is executing in guest context (either
> kernel or userspace), then the host kernel's RCU can immediately report
> that task's quiescent state.
> 
> Too much to ask for the host kernel's RCU to be able to sense the
> difference?  ;-)

KVM already notifies RCU when it's entering/exiting an extended quiescent state,
via __ct_user_{enter,exit}().

When handling an IRQ that _probably_ triggered an exit from the guest, the CPU
has already exited the quiescent state.  And AFAIK, that can't be safely changed,
i.e. KVM must note the context switch before enabling IRQs.
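
A toy userspace model of that ordering, purely for illustration.  None of this
is KVM or context tracking code: struct cpu_model and the model_*() helpers are
hypothetical, and the two assignments marked below merely stand in for the
__ct_user_{enter,exit}() calls.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state for the model. */
struct cpu_model {
	bool in_eqs;		/* RCU treats this CPU as quiescent */
	bool irqs_enabled;	/* host IRQ handlers may run */
};

static void model_guest_enter(struct cpu_model *c)
{
	c->irqs_enabled = false;	/* IRQs are disabled first... */
	c->in_eqs = true;		/* ...then stand-in for __ct_user_enter() */
}

static void model_guest_exit(struct cpu_model *c)
{
	c->in_eqs = false;		/* stand-in for __ct_user_exit() first... */
	c->irqs_enabled = true;		/* ...and only then are IRQs enabled */
}

/* Would a host IRQ handler running right now observe the CPU in the EQS? */
static bool model_irq_runs_in_eqs(const struct cpu_model *c)
{
	return c->irqs_enabled && c->in_eqs;
}
```

In this model there is no point at which the handler can run while in_eqs is
still set, which is why the IRQ that kicked the CPU out of the guest always
sees a CPU that has already left the quiescent state.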
