linux-kernel - Re: [PATCHv2 0/2] kvm/x86: vhost task creation failure handling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAPpAL=yffyhUrdEJHtAw4BDpV-=Z5mZq0r1ZFcoj1v1OBLnp_g@mail.gmail.com>
Date: Thu, 27 Feb 2025 18:21:34 +0800
From: Lei Yang <leiyang@...hat.com>
To: Keith Busch <kbusch@...a.com>
Cc: pbonzini@...hat.com, seanjc@...gle.com, kvm@...r.kernel.org, 
	x86@...nel.org, virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org, 
	Keith Busch <kbusch@...nel.org>
Subject: Re: [PATCHv2 0/2] kvm/x86: vhost task creation failure handling

Hi Keith

There are some error messages from qemu output when I tested this
series of patches with the virtio-net regression test. It can
reproduced by boot up a guest with vhost device after applied your
patches.
Error messages:
Qemu output:
qemu-kvm: -netdev {"id": "idoejzv8", "type": "tap", "vhost": true,
"vhostfd": "16", "fd": "10"}: vhost_set_owner failed: Cannot allocate
memory
qemu-kvm: -netdev {"id": "idoejzv8", "type": "tap", "vhost": true,
"vhostfd": "16", "fd": "10"}: vhost-net requested but could not be
initialized

My tests based on this commit:
# git log -1
commit dd83757f6e686a2188997cb58b5975f744bb7786 (HEAD -> master,
origin/master, origin/HEAD)
Merge: 102c16a1f9a9 eb54d2695b57
Author: Linus Torvalds <torvalds@...ux-foundation.org>
Date:   Wed Feb 26 16:55:30 2025 -0800

    Merge tag 'bcachefs-2025-02-26' of git://evilpiepirate.org/bcachefs

Thanks
Lei



On Thu, Feb 27, 2025 at 5:40 AM Keith Busch <kbusch@...a.com> wrote:
>
> From: Keith Busch <kbusch@...nel.org>
>
> The suggestion from Sean appears to be successful, so sending out a new
> version for consideration.
>
> Background:
>
> The crosvm VMM might send signals to its threads that have entered
> KVM_RUN. The signal specifically is SIGRTRMIN from here:
>
>   https://github.com/google/crosvm/blob/main/src/crosvm/sys/linux/vcpu.rs#L651
>
> If this happens to occur when the huge page recovery is trying to create
> its vhost task, that will fail with ERESTARTNOINTR. Once this happens,
> all KVM_RUN calls will fail with ENOMEM despite memory not being the
> problem.
>
> This series propogates the error up so we can distinguish that from the
> current defaulting to ENOMEM and replaces the call_once since we need to
> be able to call it repeatedly due to this condition.
>
> Changes from the v1 (prefixed as an "RFC", really) patch:
>
>   Instead of using a VM-wide mutex, update the call_once pattern to
>   complete only if what it calls is successful (from Sean).
>
> Keith Busch (1):
>   vhost: return task creation error instead of NULL
>
> Sean Christopherson (1):
>   kvm: retry nx_huge_page_recovery_thread creation
>
>  arch/x86/kvm/mmu/mmu.c    | 12 +++++-------
>  drivers/vhost/vhost.c     |  2 +-
>  include/linux/call_once.h | 16 +++++++++++-----
>  kernel/vhost_task.c       |  4 ++--
>  4 files changed, 19 insertions(+), 15 deletions(-)
>
> --
> 2.43.5
>
>