Message-ID: <4d4e5645-4443-c233-6d25-97e68d804512@redhat.com>
Date:   Thu, 24 Mar 2022 14:54:00 +0800
From:   Gavin Shan <gshan@...hat.com>
To:     Oliver Upton <oupton@...gle.com>
Cc:     kvmarm@...ts.cs.columbia.edu, maz@...nel.org,
        linux-kernel@...r.kernel.org, eauger@...hat.com,
        shan.gavin@...il.com, Jonathan.Cameron@...wei.com,
        pbonzini@...hat.com, vkuznets@...hat.com, will@...nel.org
Subject: Re: [PATCH v5 02/22] KVM: arm64: Add SDEI virtualization
 infrastructure

Hi Oliver,

On 3/24/22 1:11 AM, Oliver Upton wrote:
> More comments, didn't see exactly how all of these structures are
> getting used.
> 

Ok, thanks for your review and comments.

> On Tue, Mar 22, 2022 at 04:06:50PM +0800, Gavin Shan wrote:
> 
> [...]
> 
>> diff --git a/arch/arm64/include/uapi/asm/kvm_sdei_state.h b/arch/arm64/include/uapi/asm/kvm_sdei_state.h
>> new file mode 100644
>> index 000000000000..b14844230117
>> --- /dev/null
>> +++ b/arch/arm64/include/uapi/asm/kvm_sdei_state.h
>> @@ -0,0 +1,72 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>> +/*
>> + * Definitions of various KVM SDEI event states.
>> + *
>> + * Copyright (C) 2022 Red Hat, Inc.
>> + *
>> + * Author(s): Gavin Shan <gshan@...hat.com>
>> + */
>> +
>> +#ifndef _UAPI__ASM_KVM_SDEI_STATE_H
>> +#define _UAPI__ASM_KVM_SDEI_STATE_H
>> +
>> +#ifndef __ASSEMBLY__
>> +#include <linux/types.h>
>> +
>> +/*
>> + * The software signaled event is the default one, which is
>> + * defined in v1.1 specification.
>> + */
>> +#define KVM_SDEI_INVALID_EVENT	0xFFFFFFFF
> 
> Isn't the constraint that bit 31 must be zero? (DEN 0054C 4.4 "Event
> number allocation")
> 

Yes, bit 31 of a valid event number must be zero, so 0xFFFFFFFF can never
be a valid event number. It is used as a sentinel in struct
kvm_sdei_vcpu_state::critical_num and normal_num to indicate whether an
event is being handled on the corresponding vcpu. When those fields are
set to KVM_SDEI_INVALID_EVENT, no event is being handled on the vcpu.
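
As a rough sketch of the sentinel usage described above (not the code from
the series): the struct below is a hypothetical, trimmed-down stand-in for
struct kvm_sdei_vcpu_state, and the helper names are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KVM_SDEI_INVALID_EVENT	0xFFFFFFFFu

/* The sentinel has bit 31 set, so it can't collide with a valid event number. */
static_assert((KVM_SDEI_INVALID_EVENT >> 31) == 1, "sentinel must have bit 31 set");

/* Hypothetical stand-in for the relevant part of struct kvm_sdei_vcpu_state. */
struct sdei_vcpu_state_sketch {
	uint32_t critical_num;	/* critical event being handled, or the sentinel */
	uint32_t normal_num;	/* normal event being handled, or the sentinel */
};

/* A valid event number has bit 31 clear. */
static bool sdei_event_num_is_valid(uint32_t num)
{
	return (num & (UINT32_C(1) << 31)) == 0;
}

/* Mark the vcpu as handling no event at all. */
static void sdei_vcpu_state_reset(struct sdei_vcpu_state_sketch *s)
{
	s->critical_num = KVM_SDEI_INVALID_EVENT;
	s->normal_num = KVM_SDEI_INVALID_EVENT;
}

/* True if either a critical or a normal event is being handled. */
static bool sdei_vcpu_is_handling_event(const struct sdei_vcpu_state_sketch *s)
{
	return s->critical_num != KVM_SDEI_INVALID_EVENT ||
	       s->normal_num != KVM_SDEI_INVALID_EVENT;
}
```
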

>> +#define KVM_SDEI_DEFAULT_EVENT	0
>> +
>> +#define KVM_SDEI_MAX_VCPUS	512	/* Aligned to 64 */
>> +#define KVM_SDEI_MAX_EVENTS	128
> 
> I would *strongly* recommend against having these limits. I find the
> vCPU limit especially concerning, because we're making KVM_MAX_VCPUS
> ABI, which it definitely is not. Anything that deals with a vCPU should
> be accessed through a vCPU FD (and thus agnostic to the maximum number
> of vCPUs) to avoid such a complication.
> 

KVM_SDEI_DEFAULT_EVENT corresponds to the software signaled event. As you
suggested on PATCH[15/22], we can't assume its usage. I will define it as
SDEI_SW_SIGNALED_EVENT in uapi/linux/arm_sdei.h.

KVM_SDEI_MAX_EVENTS will be moved from this header file to kvm_sdei.h
once static arrays are used to hold the data structures or their
pointers, as you suggested earlier for this patch (PATCH[02/22]).

There are two types of (SDEI) events: shared and private. A private
event is registered independently on each vcpu, which means the entry
point address and argument, corresponding to @ep_address and @ep_arg in
struct kvm_sdei_registered_event, can differ between vcpus. For a shared
event, however, the registered/enabled states and the entry point address
and argument are the same on all vcpus. KVM_SDEI_MAX_VCPUS was introduced
so that the same data structure can represent both shared and private
events.

If the data belonging to a particular vcpu should be accessed through the
vcpu fd, then we need to split or reorganize the data structures as below.

     /*
      * The events are exposed through the ioctl interface or a similar
      * mechanism (synthetic system registers?) before they can be
      * registered. A struct kvm_sdei_exposed_event instance is reserved
      * from the kvm's static array on receiving the ioctl command
      * from the VMM.
      */
     struct kvm_sdei_exposed_event {
         __u32   num;

         __u8    type;
         __u8    signaled;
         __u8    priority;
         __u8    padding;
     };

     /*
      * The struct kvm_sdei_registered_event instance is allocated or
      * reserved from a static array. For a shared event, the instance
      * is taken from the kvm's static array and linked to the kvm. For
      * a private event, it is taken from the vcpu's static array and
      * linked to the vcpu.
      *
      * The instance is only allocated or reserved upon the
      * SDEI_EVENT_REGISTER hypercall.
      */
     struct kvm_sdei_registered_event {
         __u32   num;

#define KVM_SDEI_EVENT_STATE_REGISTERED         (1 << 0)
#define KVM_SDEI_EVENT_STATE_ENABLE             (1 << 1)
#define KVM_SDEI_EVENT_STATE_UNREGISTER_PENDING (1 << 2)
         __u8    state;
         __u8    route_mode;
         __u8    padding[2];
         __u64   route_affinity;
         __u64   ep_address;
         __u64   ep_arg;
         __u64   notifier;
     };
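
To illustrate how the @state bits and the explicit padding in the proposed
struct fit together, here is a minimal userspace sketch. It swaps __uN for
the <stdint.h> types so it builds outside the kernel, and the helper
functions are hypothetical. The rule that an event must be registered
before it can be enabled is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KVM_SDEI_EVENT_STATE_REGISTERED         (1 << 0)
#define KVM_SDEI_EVENT_STATE_ENABLE             (1 << 1)
#define KVM_SDEI_EVENT_STATE_UNREGISTER_PENDING (1 << 2)

/* Userspace stand-in for the proposed struct; uintN_t replaces __uN. */
struct kvm_sdei_registered_event {
	uint32_t num;
	uint8_t  state;
	uint8_t  route_mode;
	uint8_t  padding[2];
	uint64_t route_affinity;
	uint64_t ep_address;
	uint64_t ep_arg;
	uint64_t notifier;
};

/* The explicit padding leaves no hidden holes before the 64-bit fields. */
static_assert(offsetof(struct kvm_sdei_registered_event, route_affinity) == 8,
	      "route_affinity should start at offset 8");
static_assert(sizeof(struct kvm_sdei_registered_event) == 40,
	      "unexpected padding in the struct");

/* Hypothetical helper: has the event been registered? */
static bool sdei_event_is_registered(const struct kvm_sdei_registered_event *e)
{
	return (e->state & KVM_SDEI_EVENT_STATE_REGISTERED) != 0;
}

/* Hypothetical helper: has the event been enabled? */
static bool sdei_event_is_enabled(const struct kvm_sdei_registered_event *e)
{
	return (e->state & KVM_SDEI_EVENT_STATE_ENABLE) != 0;
}

/* Hypothetical helper: enable the event, refusing if it isn't registered. */
static bool sdei_event_enable(struct kvm_sdei_registered_event *e)
{
	if (!sdei_event_is_registered(e))
		return false;

	e->state |= KVM_SDEI_EVENT_STATE_ENABLE;
	return true;
}
```
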

>> +struct kvm_sdei_exposed_event_state {
>> +	__u64	num;
>> +
>> +	__u8	type;
>> +	__u8	signaled;
>> +	__u8	priority;
>> +	__u8	padding[5];
>> +	__u64	notifier;
> 
> Wait, isn't this a kernel function pointer!?
> 

Yeah, it is a kernel function pointer, used by Async PF to know whether
the corresponding event has been handled. Async PF can cancel the
previously injected event for performance reasons. Either Async PF or
SDEI needs to migrate it. To keep SDEI transparent enough to Async PF,
SDEI is responsible for its migration.

>> +};
>> +
>> +struct kvm_sdei_registered_event_state {
> 
> You should fold these fields together with kvm_sdei_exposed_event_state
> into a single 'kvm_sdei_event' structure:
> 

@route_mode and @route_affinity can't be configured or modified until
the event is registered. Besides, they're only valid for shared events;
private events have no routing needs. That means those two fields should
be part of struct kvm_sdei_registered_event instead of
kvm_sdei_exposed_event.


>> +	__u64	num;
>> +
>> +	__u8	route_mode;
>> +	__u8	padding[3];
>> +	__u64	route_affinity;
> 
> And these shouldn't be UAPI at the VM scope. Each of these properties
> could be accessed via a synthetic/'pseudo-firmware' register on a vCPU FD:
> 

They're accessed through the vcpu or kvm fd depending on the type of the
event. A VM-owned shared event is accessed through the kvm fd, and a
vcpu-owned private event is accessed through the vcpu fd.
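
A minimal sketch of that split, assuming per-VM and per-vcpu static arrays.
The container types, SKETCH_MAX_EVENTS and the lookup helpers are all
hypothetical; only the SDEI_EVENT_TYPE_* values mirror
uapi/linux/arm_sdei.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the values in uapi/linux/arm_sdei.h. */
#define SDEI_EVENT_TYPE_PRIVATE		0
#define SDEI_EVENT_TYPE_SHARED		1

#define SKETCH_STATE_REGISTERED		(1 << 0)
#define SKETCH_MAX_EVENTS		4	/* hypothetical array size */

/* Hypothetical, trimmed-down registered event. */
struct registered_event_sketch {
	uint32_t num;
	uint8_t  state;
};

/* Hypothetical containers: shared events live in the VM, private ones in the vcpu. */
struct vm_sketch {
	struct registered_event_sketch shared[SKETCH_MAX_EVENTS];
};

struct vcpu_sketch {
	struct registered_event_sketch private_events[SKETCH_MAX_EVENTS];
};

/* Find a registered event by number in one static array, or return NULL. */
static struct registered_event_sketch *
find_registered(struct registered_event_sketch *slots, size_t n, uint32_t num)
{
	for (size_t i = 0; i < n; i++) {
		if ((slots[i].state & SKETCH_STATE_REGISTERED) && slots[i].num == num)
			return &slots[i];
	}

	return NULL;
}

/* Pick the VM-scoped or vcpu-scoped array based on the event type. */
static struct registered_event_sketch *
lookup_registered_event(struct vm_sketch *vm, struct vcpu_sketch *vcpu,
			uint8_t type, uint32_t num)
{
	if (type == SDEI_EVENT_TYPE_SHARED)
		return find_registered(vm->shared, SKETCH_MAX_EVENTS, num);

	return find_registered(vcpu->private_events, SKETCH_MAX_EVENTS, num);
}
```
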

I'm not sure I fully understand the idea of a synthetic register, so I'd
like to confirm. If I'm correct, you're talking about the "IMPLEMENTATION
DEFINED" system registers, whose Op0 and CRn are 0b11 and 0b1x11. If two
implementation defined registers can be adopted, I don't think we need to
expose anything through the ABI. All the operations and the needed data
can be passed through the system registers.

     SYS_REG_SDEI_COMMAND
         Receives commands, e.g. to expose an event, register an event
         or change the vcpu state.
     SYS_REG_SDEI_DATA
         The data corresponding to the received command.
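
For reference, the Op0/CRn pattern mentioned above can be checked as in the
sketch below. The helper name is hypothetical, and it only tests the two
fields named in the text, with CRn assumed to be a 4-bit value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the encoding falls in the IMPLEMENTATION
 * DEFINED system register space, i.e. Op0 == 0b11 and CRn == 0b1x11.
 * Bit 2 of CRn is "don't care", so CRn can be 0b1011 (11) or 0b1111 (15).
 */
static bool sysreg_is_impdef(uint8_t op0, uint8_t crn)
{
	return op0 == 0x3 && (crn & 0xB) == 0xB;
}
```
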

However, I'm not sure that synthetic registers can be used here. When
Mark Rutland reviewed "PATCH[RFC v1] Async PF support", the feedback was
that the implementation defined registers can't be used, even in a very
limited way. At that time, a set of implementation defined registers was
defined to identify the asynchronous page faults and to access the
control data block. However, the idea was rejected. Later on, Marc
recommended SDEI for Async PF.

https://www.spinics.net/lists/kvm-arm/msg40315.html


>> +	__u64	ep_address[KVM_SDEI_MAX_VCPUS];
>> +	__u64	ep_arg[KVM_SDEI_MAX_VCPUS];
>> +	__u64	registered[KVM_SDEI_MAX_VCPUS/64];
>> +	__u64	enabled[KVM_SDEI_MAX_VCPUS/64];
>> +	__u64	unregister_pending[KVM_SDEI_MAX_VCPUS/64];
>> +};
>> +
>> +struct kvm_sdei_vcpu_event_state {
>> +	__u64	num;
>> +
>> +	__u32	event_count;
>> +	__u32	padding;
>> +};
>> +
>> +struct kvm_sdei_vcpu_regs_state {
>> +	__u64	regs[18];
>> +	__u64	pc;
>> +	__u64	pstate;
>> +};
>> +
>> +struct kvm_sdei_vcpu_state {
> 
> Same goes here, I strongly recommend you try to expose this through the
> KVM_{GET,SET}_ONE_REG interface if at all possible since it
> significantly reduces the UAPI burden, both on KVM to maintain it and
> VMMs to actually use it.
> 

Yeah, it would be much more convenient to use the implementation defined
registers here. However, I'm not sure we can do this. Please see the
details I provided above :)

Thanks,
Gavin

