Message-ID: <CAE8KmOyffXD4k69vRAFwesaqrBGzFY3i+kefbkHcQf4=jNYzOA@mail.gmail.com>
Date: Fri, 29 Dec 2023 23:28:55 +0530
From: Prasad Pandit <ppandit@...hat.com>
To: kvm@...r.kernel.org
Cc: linux-kernel@...r.kernel.org, Sean Christopherson <seanjc@...gle.com>
Subject: Fwd: About patch bdedff263132 - KVM: x86: Route pending NMIs

Hello Sean,

On Tue, 31 Oct 2023 at 17:45, Prasad Pandit <ppandit@...hat.com> wrote:
> On Mon, 30 Oct 2023 at 20:41, Sean Christopherson <seanjc@...gle.com> wrote:
>>> -               kvm_make_request(KVM_REQ_NMI, vcpu);
>>> +               if (events->nmi.pending)
>>> +                       kvm_make_request(KVM_REQ_NMI, vcpu);
> >
> > This looks sane, but it should be unnecessary as KVM_REQ_NMI with nmi_queued=0 should
> > be a (costly) nop.  Hrm, unless the vCPU is in HLT, in which case KVM will treat
> > a spurious KVM_REQ_NMI as a wake event.  When I made this change, my assumption
> > was that userspace would set KVM_VCPUEVENT_VALID_NMI_PENDING iff there was
> > relevant information to process.  But if I'm reading the code correctly, QEMU
> > invokes KVM_SET_VCPU_EVENTS with KVM_VCPUEVENT_VALID_NMI_PENDING at the end of
> > machine creation.
> >
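
For reference, the hunk above sits in the KVM_VCPUEVENT_VALID_NMI_PENDING
branch of kvm_vcpu_ioctl_x86_set_vcpu_events() in arch/x86/kvm/x86.c. A
minimal sketch of the guarded request in context, with the surrounding
lines reconstructed from memory rather than quoted verbatim:

        if (events->flags & KVM_VCPUEVENT_VALID_NMI_PENDING) {
                vcpu->arch.nmi_pending = 0;
                atomic_set(&vcpu->arch.nmi_queued, events->nmi.pending);
                /*
                 * Only request NMI processing when userspace actually
                 * queued something; a spurious KVM_REQ_NMI can wake a
                 * halted vCPU for no reason.
                 */
                if (events->nmi.pending)
                        kvm_make_request(KVM_REQ_NMI, vcpu);
        }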

QEMU:
qemu_thread_start
 kvm_start_vcpu_thread
  kvm_vcpu_thread_fn
   kvm_cpu_exec
    kvm_arch_put_registers
     kvm_put_vcpu_events (cpu=..., level=1)

qemu_thread_start (args=0x559fdc852110) at ../util/qemu-thread-posix.c:534
 kvm_vcpu_thread_fn (arg=0x559fdc84cdc0) at ../accel/kvm/kvm-accel-ops.c:56
  qemu_wait_io_event (cpu=0x559fdc84cdc0) at ../softmmu/cpus.c:435
   qemu_wait_io_event_common (cpu=0x559fdc84cdc0) at ../softmmu/cpus.c:411
    process_queued_cpu_work (cpu=0x559fdc84cdc0) at ../cpus-common.c:351
     do_kvm_cpu_synchronize_post_reset (cpu=0x559fdc84cdc0, arg=...) at ../accel/kvm/kvm-all.c:2808
      kvm_arch_put_registers (cpu=0x559fdc84cdc0, level=2) at ../target/i386/kvm/kvm.c:4664
       kvm_put_vcpu_events (cpu=0x559fdc84cdc0, level=2) at ../target/i386/kvm/kvm.c:4308

qemu_thread_start (args=0x559fdc852110) at ../util/qemu-thread-posix.c:534
 kvm_vcpu_thread_fn (arg=0x559fdc84cdc0) at ../accel/kvm/kvm-accel-ops.c:56
  qemu_wait_io_event (cpu=0x559fdc84cdc0) at ../softmmu/cpus.c:435
   qemu_wait_io_event_common (cpu=0x559fdc84cdc0) at ../softmmu/cpus.c:411
    process_queued_cpu_work (cpu=0x559fdc84cdc0) at ../cpus-common.c:351
     do_kvm_cpu_synchronize_post_init (cpu=0x559fdc84cdc0, arg=...) at ../accel/kvm/kvm-all.c:2819
      kvm_arch_put_registers (cpu=0x559fdc84cdc0, level=3) at ../target/i386/kvm/kvm.c:4664
       kvm_put_vcpu_events (cpu=0x559fdc84cdc0, level=3) at ../target/i386/kvm/kvm.c:4308
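
For reference, the 'level' argument corresponds to QEMU's KVM_PUT_*_STATE
constants (KVM_PUT_RUNTIME_STATE = 1, KVM_PUT_RESET_STATE = 2,
KVM_PUT_FULL_STATE = 3). A minimal sketch of the two callers above, from
memory of ../accel/kvm/kvm-all.c rather than a verbatim excerpt:

    static void do_kvm_cpu_synchronize_post_reset(CPUState *cpu, run_on_cpu_data arg)
    {
        kvm_arch_put_registers(cpu, KVM_PUT_RESET_STATE);  /* level = 2 */
        cpu->vcpu_dirty = false;
    }

    static void do_kvm_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg)
    {
        kvm_arch_put_registers(cpu, KVM_PUT_FULL_STATE);   /* level = 3 */
        cpu->vcpu_dirty = false;
    }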

Kernel:
  kvm_vcpu_ioctl
   mutex_lock_killable(&vcpu->mutex)
    kvm_arch_vcpu_ioctl(, KVM_SET_VCPU_EVENTS, ...)
     -> kvm_vcpu_ioctl_x86_set_vcpu_events()
   mutex_unlock(&vcpu->mutex);
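
The KVM_SET_VCPU_EVENTS case in kvm_arch_vcpu_ioctl() is roughly the
following; a sketch from memory of arch/x86/kvm/x86.c, not a verbatim
excerpt:

        case KVM_SET_VCPU_EVENTS: {
                struct kvm_vcpu_events events;

                r = -EFAULT;
                if (copy_from_user(&events, argp, sizeof(struct kvm_vcpu_events)))
                        break;

                r = kvm_vcpu_ioctl_x86_set_vcpu_events(vcpu, &events);
                break;
        }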

* Above are the three different paths via which QEMU issues the
KVM_SET_VCPU_EVENTS ioctl(2).
        QEMU/target/i386/kvm/kvm.c: kvm_put_vcpu_events()
         if (level >= KVM_PUT_RESET_STATE) {
             events.flags |= KVM_VCPUEVENT_VALID_NMI_PENDING;
  But KVM_VCPUEVENT_VALID_NMI_PENDING is set only when level >= 2
(KVM_PUT_RESET_STATE), i.e. in the first case (level=1) _NMI_PENDING is
not set; see the sketch below.
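
A sketch of the relevant part of kvm_put_vcpu_events(), abridged from
memory of QEMU's target/i386/kvm/kvm.c (not a verbatim excerpt): the NMI
fields are copied from the vCPU state unconditionally, but the flag that
tells KVM to consume them is only set for level >= KVM_PUT_RESET_STATE:

        struct kvm_vcpu_events events = {};

        events.nmi.injected = env->nmi_injected;
        events.nmi.pending = env->nmi_pending;
        events.nmi.masked = !!(env->hflags2 & HF2_NMI_MASK);
        ...
        if (level >= KVM_PUT_RESET_STATE) {
            events.flags |= KVM_VCPUEVENT_VALID_NMI_PENDING;
            ...
        }

        ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);

So on the level=2 and level=3 paths the flag is set even when both
env->nmi_injected and env->nmi_pending are zero.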

* In the real-time host set-up I have, KVM_SET_VCPU_EVENTS is called
twice for each vCPU with KVM_VCPUEVENT_VALID_NMI_PENDING set, and after
that the kernel goes into what looks like a lock-contention loop. Each
time the flag is set while 'cpu->env->nmi_injected = 0' and
'cpu->env->nmi_pending = 0', i.e. for each vCPU
kvm_make_request(KVM_REQ_NMI, vcpu) is invoked twice even though the
vCPU has no NMIs pending.

# perf lock report -t
                 Name   acquired  contended   avg wait  total wait    max wait    min wait

            CPU 3/KVM     154017     154017   62.19 us      9.58 s   101.01 us     1.49 us
            CPU 9/KVM     152796     152796   62.67 us      9.58 s    95.92 us     1.49 us
            CPU 7/KVM     151554     151554   63.16 us      9.57 s   102.70 us     1.48 us
            CPU 1/KVM     151273     151273   65.30 us      9.88 s    98.88 us     1.52 us
            CPU 6/KVM     151107     151107   63.34 us      9.57 s   107.64 us     1.50 us
            CPU 8/KVM     151038     151038   63.37 us      9.57 s   102.93 us     1.51 us
            CPU 2/KVM     150701     150701   63.52 us      9.57 s    99.24 us     1.50 us
            CPU 5/KVM     150695     150695   63.56 us      9.58 s   142.15 us     1.50 us
            CPU 4/KVM     150527     150527   63.60 us      9.57 s   102.04 us     1.44 us
      qemu-system-x86        665        665   65.92 us    43.84 ms   100.67 us     1.55 us
            CPU 0/KVM          2          2  210.46 us   420.92 us   411.89 us     9.03 us
      qemu-system-x86          1          1  404.91 us   404.91 us   404.91 us   404.91 us
         TC tc-pc.ram          1          1  414.22 us   414.22 us   414.22 us   414.22 us
  === output for debug===
bad: 10, total: 13
bad rate: 76.92 %
histogram of events caused bad sequence
    acquire: 0
   acquired: 10
  contended: 0
    release: 0


* The VCPU#0 thread seems to wait indefinitely on
qemu_mutex_iothread_lock() and cannot make any progress. The proposed
patch above, which checks 'events->nmi.pending' for a non-zero value
before making the request, helps to fix this issue.

...wdyt?

Thank you.
---
  - Prasad
PS: The kvm_make_request() routine has the following comment; I wonder
if this is what is happening with the empty NMI events.
         Request that don't require vCPU action should never be logged in
         vcpu->requests.  The vCPU won't clear the request, so it will stay
         logged indefinitely and prevent the vCPU from entering the guest.
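
For reference, making a request is essentially just setting a bit in
vcpu->requests; roughly (a simplified sketch based on
include/linux/kvm_host.h from memory; the kvm_make_request()/
__kvm_make_request() split may differ between kernel versions):

        static __always_inline void __kvm_make_request(int req, struct kvm_vcpu *vcpu)
        {
                /* Publish any request payload before the request bit itself. */
                smp_wmb();
                set_bit(req & KVM_REQUEST_MASK, (void *)&vcpu->requests);
        }

The bit only goes away once the vCPU side consumes it with
kvm_check_request(), which is what the comment above warns about for
requests that nothing ever consumes.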

