Date:   Mon, 13 Feb 2023 17:05:40 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Mathias Krause <minipli@...ecurity.net>
Cc:     kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH 0/5] KVM: Put struct kvm_vcpu on a diet

On Mon, Feb 13, 2023, Mathias Krause wrote:
> Relayout members of struct kvm_vcpu and embedded structs to reduce its
> memory footprint. Not that it matters much from a memory usage point of
> view (given how few such objects get allocated), but this series
> manages to make it consume two cachelines less, which should provide a
> micro-architectural net win. However, I wasn't able to see a noticeable
> difference running benchmarks within a guest VM -- the VMEXIT costs are
> likely still high enough to mask any gains.
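
For illustration, the relayout in question is the classic hole-elimination
trick: reorder members so alignment padding disappears. A minimal,
hypothetical sketch on a 64-bit target (none of these are real kvm_vcpu
fields):

struct before {
	char a;			/* offset 0, size 1 */
	/* 7-byte hole so 'b' is 8-byte aligned */
	unsigned long b;	/* offset 8, size 8 */
	char c;			/* offset 16, size 1 */
	/* 7 bytes of tail padding */
};				/* sizeof == 24 */

struct after {
	unsigned long b;	/* offset 0, size 8 */
	char a;			/* offset 8, size 1 */
	char c;			/* offset 9, size 1 */
	/* 6 bytes of tail padding */
};				/* sizeof == 16 */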

...

> Below is the high-level pahole(1) diff. Most significant is the overall
> size change from 6688 to 6560 bytes, i.e. -128 bytes.
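
For reference, pahole(1) reads a binary's DWARF debug info and annotates
each struct with member offsets, sizes, and padding holes; a typical
invocation is e.g. pahole -C kvm_vcpu vmlinux. A fabricated sample of its
output style on a made-up struct (not the actual kvm_vcpu report):

struct example {
	void *                     owner;               /*     0     8 */
	int                        id;                  /*     8     4 */

	/* XXX 4 bytes hole, try to pack */

	unsigned long              flags;               /*    16     8 */

	/* size: 24, cachelines: 1, members: 3 */
	/* sum members: 20, holes: 1, sum holes: 4 */
	/* last cacheline: 24 bytes */
};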

While part of me wishes KVM were more careful about struct layouts, IMO fiddling
with per-vCPU or per-VM structures isn't worth the ongoing maintenance cost.

Unless the size of the vCPU allocation (vcpu_vmx or vcpu_svm in x86 land) crosses
a meaningful boundary, e.g. drops the size from an order-3 to order-2 allocation,
the memory savings are negligible in the grand scheme.  Assuming the kernel is
even capable of perfectly packing vCPU allocations, saving even a few hundred bytes
per vCPU is uninteresting unless the vCPU count gets reaaally high, and at that
point the host likely has hundreds of GiB of memory, i.e. saving a few KiB is again
uninteresting.
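
To make the order-boundary point concrete, here's a small userspace sketch
approximating the kernel's get_order() for 4 KiB pages; note that the
-128 bytes from the quoted pahole diff leaves the allocation order
unchanged:

#include <stdio.h>

#define PAGE_SIZE 4096UL

/* Userspace approximation of the kernel's get_order(): the smallest
 * order such that (PAGE_SIZE << order) >= size. */
static unsigned int order_for(unsigned long size)
{
	unsigned int order = 0;

	while ((PAGE_SIZE << order) < size)
		order++;
	return order;
}

int main(void)
{
	/* Sizes from the pahole diff quoted above. */
	printf("6688 bytes -> order-%u\n", order_for(6688));	/* order-1 */
	printf("6560 bytes -> order-%u\n", order_for(6560));	/* order-1 */
	return 0;
}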

And as you observed, imperfect struct layouts are highly unlikely to have a
measurable impact on performance.  The types of operations that are involved in
a world switch are just too costly for the layout to matter much.  I do like to
shave cycles in the VM-Enter/VM-Exit paths, but only when a change is inarguably
more performant, doesn't require ongoing maintenance, and/or improves the code
quality.

I am in favor of cleaning up kvm_mmu_memory_cache as there's no reason to carry
a sub-optimal layout, and the cleanup is arguably warranted even without the
size reduction.  Ditto for kvm_pmu; logically, I think it makes sense to have
the version at the very top.

But I dislike using bitfields instead of bools in kvm_queued_exception, and
shuffling fields in kvm_vcpu, kvm_vcpu_arch, vcpu_vmx, vcpu_svm, etc. just
isn't worth the long-term cost unless there's a truly egregious field layout.
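
To illustrate the trade-off being rejected here (flag names loosely
modeled on kvm_queued_exception, not the exact struct):

#include <stdbool.h>

/* One byte per flag: three bytes total, but each flag is individually
 * addressable and updated with a simple byte store. */
struct exc_bools {
	bool pending;
	bool injected;
	bool has_error_code;
};					/* sizeof == 3 */

/* Bitfields pack all three flags into one byte (char-typed bitfields
 * are implementation-defined in standard C but supported by GCC and
 * Clang), at the cost of read-modify-write updates and no per-flag
 * addresses. */
struct exc_bitfields {
	unsigned char pending		: 1;
	unsigned char injected		: 1;
	unsigned char has_error_code	: 1;
};					/* sizeof == 1 */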
