Message-ID: <ZUA5nnAV3CxOX9lB@google.com>
Date:   Mon, 30 Oct 2023 16:17:50 -0700
From:   Sean Christopherson <seanjc@...gle.com>
To:     Xiaoyao Li <xiaoyao.li@...el.com>
Cc:     Vitaly Kuznetsov <vkuznets@...hat.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Jonathan Corbet <corbet@....net>,
        Wanpeng Li <wanpengli@...cent.com>, x86@...nel.org,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] x86/kvm/async_pf: Use separate percpu variable to
 track the enabling of asyncpf

On Mon, Oct 30, 2023, Xiaoyao Li wrote:
> On 10/25/2023 10:22 PM, Sean Christopherson wrote:
> > On Wed, Oct 25, 2023, Vitaly Kuznetsov wrote:
> > > Xiaoyao Li <xiaoyao.li@...el.com> writes:
> > > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > > > index b8ab9ee5896c..388a3fdd3cad 100644
> > > > --- a/arch/x86/kernel/kvm.c
> > > > +++ b/arch/x86/kernel/kvm.c
> > > > @@ -65,6 +65,7 @@ static int __init parse_no_stealacc(char *arg)
> > > >   early_param("no-steal-acc", parse_no_stealacc);
> > > > +static DEFINE_PER_CPU_READ_MOSTLY(bool, async_pf_enabled);
> > > 
> > > Would it make a difference if we replace this with a cpumask? I realize
> > > that we need to access it on all CPUs from hotpaths but this mask will
> > > rarely change so maybe there's no real performance hit?
> > 
> > FWIW, I personally prefer per-CPU booleans from a readability perspective.  I
> > doubt there is a meaningful performance difference for a bitmap vs. individual
> > booleans, the check is already gated by a static key, i.e. kernels that are NOT
> > running as KVM guests don't care.
> 
> I agree with it.
> 
> > Actually, if there are performance gains to be had, optimizing kvm_read_and_reset_apf_flags()
> > to read the "enabled" flag if and only if it's necessary is a more likely candidate.
> > Assuming the host isn't being malicious/stupid, then apf_reason.flags will be '0'
> > if PV async #PFs are disabled.  The only question is whether or not apf_reason.flags
> > is predictable enough for the CPU.
> > 
> > Aha!  In practice, the CPU already needs to resolve a branch based on apf_reason.flags,
> > it's just "hidden" up in __kvm_handle_async_pf().
> > 
> > If we really want to micro-optimize, provide an __always_inline inner helper so
> > that __kvm_handle_async_pf() doesn't need to make a CALL just to read the flags.
> > Then in the common case where a #PF isn't due to the host swapping out a page,
> > the paravirt happy path doesn't need a taken branch and never reads the enabled
> > variable.  E.g. the below generates:
> 
> If this is wanted, it can be a separate patch, unrelated to this series,
> I think.

Yes, it's definitely beyond the scope of this series.
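
For readers following the thread, the flags-first ordering discussed above can
be illustrated with a minimal user-space sketch. This is not the kernel code:
struct apf_state, apf_read_and_reset_flags() and the enabled_reads counter are
hypothetical names invented for illustration, a plain struct stands in for the
per-CPU variables, and "static inline" stands in for the kernel's
__always_inline. The only point shown is that when flags is 0 (the common case
of an ordinary #PF), the "enabled" variable is never read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU state; not the kernel's layout. */
struct apf_state {
	uint32_t flags;          /* written by the host; 0 if no async #PF event */
	bool enabled;            /* models the per-CPU "enabled" boolean */
	unsigned enabled_reads;  /* instrumentation only: reads of 'enabled' */
};

/*
 * Inner helper: test flags first and consult 'enabled' only when flags is
 * non-zero, so the common path never touches 'enabled'.
 */
static inline uint32_t apf_read_and_reset_flags(struct apf_state *s)
{
	uint32_t flags;

	assert(s);
	flags = s->flags;
	if (flags == 0)
		return 0;        /* ordinary #PF: 'enabled' is not read */

	s->enabled_reads++;
	if (!s->enabled)
		return 0;        /* non-zero flags while disabled: ignore them */

	s->flags = 0;            /* consume the event */
	return flags;
}
```

The sketch keeps the behavior that flags are neither reported nor cleared
while async #PF is disabled; it only changes which variable is read first.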
