lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aAkAY40UbqzQNr8m@google.com>
Date: Wed, 23 Apr 2025 07:59:47 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: lirongqing <lirongqing@...du.com>
Cc: pbonzini@...hat.com, kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	lizhaoxin <lizhaoxin04@...du.com>
Subject: Re: [PATCH] KVM: Use call_rcu() in kvm_io_bus_register_dev

On Wed, Apr 23, 2025, lirongqing wrote:
> From: Li RongQing <lirongqing@...du.com>
> 
> Use call_rcu() instead of costly synchronize_srcu_expedited(), this
> can reduce the VM bootup time, and reduce VM migration downtime
>
> Signed-off-by: lizhaoxin <lizhaoxin04@...du.com>

If lizhaoxin is a co-author, then this needs:

  Co-developed-by: lizhaoxin <lizhaoxin04@...du.com>

If they are _the_ author, then the From: above is wrong.

> Signed-off-by: Li RongQing <lirongqing@...du.com>
> ---
>  include/linux/kvm_host.h |  1 +
>  virt/kvm/kvm_main.c      | 11 +++++++++--
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 291d49b..e772704 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -203,6 +203,7 @@ struct kvm_io_range {
>  #define NR_IOBUS_DEVS 1000
>  
>  struct kvm_io_bus {
> +	struct rcu_head rcu;
>  	int dev_count;
>  	int ioeventfd_count;
>  	struct kvm_io_range range[];
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 2e591cc..af730a5 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -5865,6 +5865,13 @@ int kvm_io_bus_read(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
>  	return r < 0 ? r : 0;
>  }
>  
> +static void free_kvm_io_bus(struct rcu_head *rcu)
> +{
> +	struct kvm_io_bus *bus = container_of(rcu, struct kvm_io_bus, rcu);
> +
> +	kfree(bus);
> +}
> +
>  int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
>  			    int len, struct kvm_io_device *dev)
>  {
> @@ -5903,8 +5910,8 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
>  	memcpy(new_bus->range + i + 1, bus->range + i,
>  		(bus->dev_count - i) * sizeof(struct kvm_io_range));
>  	rcu_assign_pointer(kvm->buses[bus_idx], new_bus);
> -	synchronize_srcu_expedited(&kvm->srcu);
> -	kfree(bus);
> +
> +	call_srcu(&kvm->srcu, &bus->rcu, free_kvm_io_bus);

I don't think this is safe from a functional correctness perspective, as KVM must
guarantee all readers see the new device before KVM returns control to userspace.
E.g. I'm pretty sure KVM_REGISTER_COALESCED_MMIO is used while vCPUs are active.

However, I'm pretty sure the only readers that actually rely on SRCU are vCPUs,
so I _think_ the synchronize_srcu_expedited() is necessary if and only if vCPUs
have been created.

That could race with concurrent vCPU creation in a few flows that don't take
kvm->lock, but that should be ok from an ABI perspective.  False positives (vCPU
creation fails) are benign, and false negatives (vCPU created after the check)
are inherently racy, i.e. userspace can't guarantee the vCPU sees any particular
ordering.

So this?

	if (READ_ONCE(kvm->created_vcpus)) {
		synchronize_srcu_expedited(&kvm->srcu);
		kfree(bus);
	} else {
		call_srcu(&kvm->srcu, &bus->rcu, free_kvm_io_bus);
	}

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ