Message-ID: <CAPm50aJjZhTWZVMj6FVtOP70ZuSVPrHPqFvVors1NmJ+8SYVQw@mail.gmail.com>
Date:   Thu, 7 Sep 2023 09:27:32 +0800
From:   Hao Peng <flyingpenghao@...il.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Xiaoyao Li <xiaoyao.li@...el.com>, pbonzini@...hat.com,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] KVM: X86: Reduce calls to vcpu_load

On Thu, Sep 7, 2023 at 4:08 AM Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Wed, Sep 06, 2023, Xiaoyao Li wrote:
> > On 9/6/2023 2:24 PM, Hao Peng wrote:
> > > From: Peng Hao <flyingpeng@...cent.com>
> > >
> > > A vcpu_load/put pair takes about 1-2us. Each
> > > kvm_arch_vcpu_create calls vcpu_load/put
> > > to initialize some VMCS fields. That initialization
> > > can be deferred until the first vcpu ioctl, which
> > > reduces the number of calls to vcpu_load.
> >
> > What if no vcpu ioctl is called after vcpu creation?
> >
> > And will the first (it was second before this patch) vcpu_load() become
> > longer? Have you measured it?
>
> I don't think the first vcpu_load() becomes longer, this avoids an entire
> load()+put() pair by doing the initialization in the first ioctl().
>
> That said, the patch is obviously buggy, it hooks kvm_arch_vcpu_ioctl() instead
> of kvm_vcpu_ioctl(), e.g. doing KVM_RUN, KVM_SET_SREGS, etc. will cause explosions.
>
> It will also break the TSC synchronization logic in kvm_arch_vcpu_postcreate(),
> which can "race" with ioctls() as the vCPU file descriptor is accessible by
> userspace the instant it's installed into the fd tables, i.e. userspace doesn't
> have to wait for KVM_CREATE_VCPU to complete.
>
It does help when there are many cores. The hook point problem mentioned
above can still be fixed, but the TSC synchronization problem is
difficult to deal with.
Thanks.
> And I gotta imagine there are other interactions I haven't thought of off the
> top of my head, e.g. the vCPU is also reachable via kvm_for_each_vcpu().  All it
> takes is one path that touches a lazily initialized field for this to fall apart.
>
> > I don't think it's worth the optimization without a strong reason.
>
> Yeah, this is a lot of subtle complexity to shave 1-2us.
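
For readers following the thread, the trade-off discussed above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not KVM code: every name below (toy_vcpu, toy_create, toy_ioctl, toy_peek) is hypothetical, the counter stands in for a vcpu_load()/vcpu_put() pair, and control_field stands in for a lazily initialized VMCS field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in KVM. */
struct toy_vcpu {
	bool state_ready;     /* set once the deferred setup has run */
	int  control_field;   /* stands in for a lazily initialized field */
	int  load_put_pairs;  /* counts simulated load()+put() pairs */
};

/* Simulates the cost of one load()+put() pair. */
static void toy_load_put(struct toy_vcpu *v)
{
	v->load_put_pairs++;
}

/* Simulates the field setup that needs the vCPU to be loaded. */
static void toy_setup(struct toy_vcpu *v)
{
	v->control_field = 42;
	v->state_ready = true;
}

/* Eager mode pays one extra pair at creation; lazy mode defers the setup. */
void toy_create(struct toy_vcpu *v, bool lazy)
{
	assert(v != NULL);
	v->state_ready = false;
	v->control_field = 0;
	v->load_put_pairs = 0;
	if (!lazy) {
		toy_load_put(v);
		toy_setup(v);
	}
}

/* Every ioctl pays one pair; the first one also runs any deferred setup. */
int toy_ioctl(struct toy_vcpu *v)
{
	toy_load_put(v);
	if (!v->state_ready)
		toy_setup(v);
	return v->control_field;
}

/* A path that bypasses the ioctl entry point, e.g. iterating over vCPUs. */
int toy_peek(const struct toy_vcpu *v)
{
	return v->control_field;
}
```

In this model the lazy variant saves exactly one simulated pair per vCPU, but toy_peek() on a lazily created vCPU sees the field before it is set up. That is the "one path that touches a lazily initialized field" failure mode, reduced to its simplest form; the model does not attempt to capture the TSC synchronization race.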
