Message-ID: <YU42b1iwIpZS0iCp@google.com>
Date:   Fri, 24 Sep 2021 20:34:55 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Dongli Zhang <dongli.zhang@...cle.com>
Cc:     kvm@...r.kernel.org, pbonzini@...hat.com, vkuznets@...hat.com,
        wanpengli@...cent.com, jmattson@...gle.com, joro@...tes.org,
        tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, x86@...nel.org,
        hpa@...or.com, linux-kernel@...r.kernel.org, joe.jin@...cle.com
Subject: Re: [PATCH RFC 1/1] kvm: export per-vcpu exits to userspace

On Tue, Sep 07, 2021, Dongli Zhang wrote:
> People sometimes may blame KVM scheduling if there is softlockup/rcu_stall
> in VM kernel. The KVM developers are required to prove that a specific VCPU
> is being regularly scheduled by KVM hypervisor.
> 
> So far we use "pidstat -p <qemu-pid> -t 1" or
> "cat /proc/<pid>/task/<tid>/stat", but 'exits' is more fine-grained.

Sort of?  Yes, it counts _almost_ every VM-Exit, but it's also measuring something
completely different.

> Therefore, the 'exits' is exported to userspace to verify if a VCPU is
> being scheduled regularly.

The number of VM-Exits seems like a very cumbersome and easily misinterpreted
indicator, e.g. userspace could naively think that a guest that is generating a
high number of exits is getting more runtime.  With posted interrupts and other
hardware features, that doesn't necessarily hold true.

I'm not saying don't count exits, they absolutely can be a good triage tool, but
they're not the right tool to verify tasks are getting scheduled.
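
For illustration only, here is a rough sketch of the existing approach mentioned
in the patch description, i.e. reading "/proc/<pid>/task/<tid>/stat" to see
whether a vCPU thread is accumulating CPU time.  The helper names and the sample
line are hypothetical; the field positions (utime is field 14, stime is field
15, both in clock ticks) follow proc(5).

```python
from pathlib import Path


def parse_task_cpu_ticks(stat_line: str) -> tuple[int, int]:
    """Return (utime, stime) in clock ticks from one /proc/.../stat line.

    The comm field (field 2) is wrapped in parentheses and may itself
    contain spaces or ')', so split on the *last* ')' before tokenizing.
    """
    _, _, rest = stat_line.rpartition(")")
    fields = rest.split()
    # fields[0] is field 3 (state), so field 14 (utime) is fields[11]
    # and field 15 (stime) is fields[12].
    return int(fields[11]), int(fields[12])


def read_task_cpu_ticks(pid: int, tid: int) -> tuple[int, int]:
    """Hypothetical helper: read and parse the stat file of one thread."""
    line = Path(f"/proc/{pid}/task/{tid}/stat").read_text()
    return parse_task_cpu_ticks(line)


if __name__ == "__main__":
    # Made-up sample line for a thread named "CPU 0/KVM".
    sample = ("1234 (CPU 0/KVM) S 1 1234 1234 0 -1 4194304 "
              "100 0 0 0 500 300 0 0 20 0 1 0 12345")
    print(parse_task_cpu_ticks(sample))
```

Sampling these values twice and comparing the deltas shows whether the thread
received CPU time in the interval, which is a statement about scheduling and
not about the number of VM-Exits.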

> I was going to export 'exits', until there was binary stats available.
> Unfortunately, QEMU does not support binary stats and we will need to
> read via debugfs temporarily. This patch can also be backported to prior
> versions that do not support binary stats.

Adding temporary code to the _upstream_ kernel to work around lack of support in
the userspace VMM does not seem right to me.  Especially in debugfs, which is
very explicitly not intended to be used for things like monitoring in production.
