lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Zxf2ZK7HS7jL7TQk@google.com>
Date: Tue, 22 Oct 2024 12:00:52 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Maxim Levitsky <mlevitsk@...hat.com>
Cc: kvm@...r.kernel.org, Will Deacon <will@...nel.org>, linux-kernel@...r.kernel.org, 
	Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, 
	Ingo Molnar <mingo@...hat.com>, "H. Peter Anvin" <hpa@...or.com>, Thomas Gleixner <tglx@...utronix.de>, 
	Joerg Roedel <joro@...tes.org>, Suravee Suthikulpanit <suravee.suthikulpanit@....com>, 
	Robin Murphy <robin.murphy@....com>, iommu@...ts.linux.dev, 
	Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH v3 0/4] Allow AVIC's IPI virtualization to be optional

On Mon, Oct 21, 2024, Sean Christopherson wrote:
> On Wed, Oct 04, 2023, Maxim Levitsky wrote:
> > About the added 'vcpu->loaded' variable, I added it also because it is
> > something that is long overdue to be added, I remember that in IPIv code
> > there was also a need for this, and probalby more places in KVM can be
> > refactored to take advantage of it, instead of various hacks.
> 
> I don't view using the information from the Physical ID table as a hack.  It very
> explicitly uses the ir_list_lock to ensure that the pCPU that's programmed into
> the IRTE is the pCPU on which the vCPU is loaded, and provides rather strict
> ordering between task migration and device assignment.  It's not a super hot path,
> so I don't think lockless programming is justified.
> 
> I also think we should keep IsRunning=1 when the vCPU is unloaded.  That approach
> won't run afoul of your concern with signaling the wrong pCPU, because KVM can
> still keep the ID up-to-date, e.g. if the task is migrated when a pCPU is being
> offlined.
> 
> The motiviation for keeping IsRunning=1 is to avoid unnecessary VM-Exits and GA
> log IRQs.  E.g. if a vCPU exits to userspace, there's zero reason to force IPI
> senders to exit, because KVM can't/won't notify userspace, and the pending virtual
> interrupt will be processed on the next VMRUN.

My only hesitation to keeping IsRunning=1 is that there could, in theory, be a
noisy neighbor problem.  E.g. if there is meaningful overhead when the CPU responds
to the doorbell.  Hrm, and if another vCPU is scheduled in on the same pCPU, that
vCPU could end up processing a virtual interrupt in response to a doorbell intended
for a different vCPU.

The counter-argument to both concerns is that APICv Posted Interrupts have had a
_worse_ version of that behavior for years, and no one has complained.  KVM sets
PID.SN only when a vCPU is _preempted_, and so devices (and now virtual IPIs) will
send notification IRQs to pCPUs that aren't actively running the vCPU, or are
running a different vCPU.

The counter-counter-argument is that (a) IPI virtualization is a recent addition,
and device posted interrupts are unlikely to be used in a CPU oversubscribed setup,
and (b) Posted Interrupts are effectively rate-limited to a single "spurious"
notification per vCPU, as notification IRQs are sent if and only if PID.ON=0.

That said, while I'm somewhat less confident that keeping IsRunning=1 is desirable
for all use cases than I was yesterday, I still think we should avoid tightly
coupling it to whether or not the vCPU is loaded, because there are undoubtedly
setups where it _is_ desirable, e.g. if vCPUs are pinned 1:1 to pCPUs.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ