Message-ID: <YmRI7Bh7fWCYLUGT@google.com>
Date: Sat, 23 Apr 2022 18:43:56 +0000
From: Oliver Upton <oupton@...gle.com>
To: Gavin Shan <gshan@...hat.com>
Cc: kvmarm@...ts.cs.columbia.edu, linux-kernel@...r.kernel.org,
eauger@...hat.com, Jonathan.Cameron@...wei.com,
vkuznets@...hat.com, will@...nel.org, shannon.zhaosl@...il.com,
james.morse@....com, mark.rutland@....com, maz@...nel.org,
pbonzini@...hat.com, shan.gavin@...il.com
Subject: Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure

On Sat, Apr 23, 2022 at 10:18:49PM +0800, Gavin Shan wrote:
> Hi Oliver,
>
> On 4/23/22 5:48 AM, Oliver Upton wrote:
> > On Sun, Apr 03, 2022 at 11:38:56PM +0800, Gavin Shan wrote:
> > > Software Delegated Exception Interface (SDEI) provides a mechanism
> > > for registering and servicing system events, as defined by ARM DEN0054C
> > > specification. One of these events will be used by Asynchronous Page
> > > Fault (Async PF) to deliver notifications from host to guest.
> > >
> > > The events are classified into shared and private ones according to
> > > their scopes. The shared events are system or VM scoped, but the
> > > private events are CPU or VCPU scoped. The shared events can be
> > > registered, enabled, unregistered and reset through hypercalls
> > > issued from any VCPU. However, the private events are registered,
> > > enabled, unregistered and reset on the calling VCPU through
> > > hypercalls. Besides, the events are also classified into critical
> > > and normal events according to their priority. During event delivery
> > > and handling, the normal event can be preempted by another critical
> > > event, but not in reverse way. The critical event is never preempted
> > > by another normal event.
> >
> > We don't have any need for critical events though, right? We should avoid
> > building out the plumbing around the concept of critical events until
> > there is an actual use case for it.
> >
>
> The Async PF one is critical event, as guest needs to handle it immediately.

But that's the sticking point for me. IIUC, we're going to deliver an
async PF SDEI event to the PE that is waiting on a page so it can go do
something else and wait for the page to come in. Normal events preempt
~everything, critical events preempt even normal events.

How can the guest context switch and do something better at an arbitrary
instruction boundary (such as in an SDEI handler of normal priority)? If
a guest takes a page fault in that context, it may as well wait
synchronously for the page to come in.

And in the case of the page ready event, we still need to clean up shop
before switching to the unblocked context.
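
For reference, a minimal sketch of the preemption rule as described above.
All names here are hypothetical and chosen for illustration; none of them
come from the patch:

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not taken from the patch. */
enum sdei_ctx {
	CTX_NONE,	/* ordinary guest execution, no SDEI handler running */
	CTX_NORMAL,	/* inside a normal-priority SDEI handler */
	CTX_CRITICAL,	/* inside a critical-priority SDEI handler */
};

enum sdei_prio {
	PRIO_NORMAL,
	PRIO_CRITICAL,
};

/* Can an event of priority @incoming preempt the current context @cur? */
static bool sdei_can_preempt(enum sdei_ctx cur, enum sdei_prio incoming)
{
	switch (cur) {
	case CTX_NONE:
		/* Any event preempts ordinary execution. */
		return true;
	case CTX_NORMAL:
		/* Only a critical event preempts a normal handler. */
		return incoming == PRIO_CRITICAL;
	case CTX_CRITICAL:
	default:
		/* Nothing preempts a critical handler. */
		return false;
	}
}
```

If only normal-priority events exist, the CTX_CRITICAL case disappears and
an event can only ever preempt ordinary execution, so handlers never nest.
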
> Otherwise, it's possible that guest can't continue its execution. Besides,
> the software signaled event (0x0) is normal event. They're the only two
> events to be supported, I assume the software signaled event (0x0) is only
> used selftest/kvm. So Async PF one becomes the only event and it can be
> in normal priority until other SDEI event needs to be added and supported.

I believe there are multiple use cases for guest-initiated SDEI events
beyond just testing. Poking a hung PE is but one example.

> However, the logic to support critical/normal events has been here. So
> I think it's probably nice to keep it. At least, it makes it easier to
> add a new SDEI event in future. We dropped the support for the shared
> event from v5 to v6, I think we probably never need a shared event for
> ever :)

But then we're sprinkling a lot of dead code throughout KVM, right? It
makes KVM's job even easier if it doesn't have to worry about nesting
SDEI events.

> > > +struct kvm_sdei_exposed_event {
> > > +	unsigned int	num;
> > > +	unsigned char	type;
> > > +	unsigned char	signaled;
> >
> > what is this used for?
> >
>
> It indicates the event can be raised by software or not. For those
> events exposed by KVM should be raised by software, so this should
> always be true.

Isn't there always going to be some piece of software that raises an
event?

For KVM, we have guest-initiated 'software-signaled' events and
KVM-initiated async PF events (and whatever else may follow).

> > Do we need this if we disallow nesting events?
> >
>
> Yes, we need this. "event == NULL" is used as indication of invalid
> context. @event is the associated SDEI event when the context is
> valid.

What if we use some other plumbing to indicate the state of the vCPU? MP
state comes to mind, for example.

> > > +/*
> > > + * According to SDEI specification (v1.1), the event number spans 32-bits
> > > + * and the lower 24-bits are used as the (real) event number. I don't
> > > + * think we can use that much event numbers in one system. So we reserve
> > > + * two bits from the 24-bits real event number, to indicate its types:
> > > + * physical or virtual event. One reserved bit is enough for now, but
> > > + * two bits are reserved for possible extension in future.
> > > + *
> > > + * The physical events are owned by firmware while the virtual events
> > > + * are used by VMM and KVM.
> >
> > Doesn't KVM own everything? I don't see how the guest could interact
> > with another SDEI implementation.
> >
>
> I might be overthinking on the scheme. The host's firmware might have
> SDEI supported and we want to propagate these events originated from
> host's firmware to guest. In this case, we need to distinguish the events
> originated from host's firmware and kvm (guest's firmware). Even this
> case isn't possible to happen, I think it's still nice to distinguish
> the events originated from a real firmware or KVM emulated firmware.

The guest ABI w.r.t. SDEI is under full ownership of KVM. Any other
implementation's events will never get exposed to the guest.

Couldn't the guest own the host if it was talking to our firmware
anyway?

> > > + */
> > > +#define KVM_SDEI_EVENT_NUM_TYPE_SHIFT 22
> > > +#define KVM_SDEI_EVENT_NUM_TYPE_MASK (3 << KVM_SDEI_EVENT_NUM_TYPE_SHIFT)
> > > +#define KVM_SDEI_EVENT_NUM_TYPE_PHYS 0
> > > +#define KVM_SDEI_EVENT_NUM_TYPE_VIRT 1
> > > +
> > > +static inline bool kvm_sdei_is_virtual(unsigned int num)
> > > +{
> > > +	unsigned int type;
> > > +
> > > +	type = (num & KVM_SDEI_EVENT_NUM_TYPE_MASK) >>
> > > +	       KVM_SDEI_EVENT_NUM_TYPE_SHIFT;
> > > +	if (type == KVM_SDEI_EVENT_NUM_TYPE_VIRT)
> > > +		return true;
> > > +
> > > +	return false;
> > > +}
> > > +
> > > +static inline bool kvm_sdei_is_sw_signaled(unsigned int num)
> > > +{
> > > +	return num == SDEI_SW_SIGNALED_EVENT;
> > > +}
> >
> > Couldn't the caller just check the event number on their own?
> >
>
> It would be hard because the caller can be guest. Generally, the
> event and its associated information/state are accessed by hypercalls,
> event injection and delivery, migration to be supported in future.
> So I think it's good to check the event number by ourselves.

What I'm saying is, can't the caller of kvm_sdei_is_sw_signaled() just
do the comparison?
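
Something along these lines, as a rough sketch. The function name and the
return values are made up for illustration, and SDEI_SW_SIGNALED_EVENT is
defined locally as 0 to match the event number (0x0) quoted earlier in
this thread:

```c
#include <stdint.h>

/* Event 0x0 is the software-signaled event, per the discussion above. */
#define SDEI_SW_SIGNALED_EVENT	0

/*
 * Hypothetical caller: open-code the comparison rather than going
 * through a one-line wrapper. Returns 0 if @num may be signaled by
 * software, -1 otherwise (placeholder return values).
 */
static int handle_event_signal(uint32_t num)
{
	if (num != SDEI_SW_SIGNALED_EVENT)
		return -1;

	return 0;
}
```

The check stays on KVM's side either way; the only difference is that it
is written at the call site instead of behind a helper.
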
--
Thanks,
Oliver