linux-kernel - Re: [PATCH 00/23] KVM: MMU: MMU role refactoring


Message-ID: <CALzav=d05sMd=ARkV+GMf9SkxKcg9c9n5ttb274M2fZrP27PDA@mail.gmail.com>
Date:   Mon, 7 Feb 2022 15:53:03 -0800
From:   David Matlack <dmatlack@...gle.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        kvm list <kvm@...r.kernel.org>,
        Vitaly Kuznetsov <vkuznets@...hat.com>
Subject: Re: [PATCH 00/23] KVM: MMU: MMU role refactoring

On Mon, Feb 7, 2022 at 3:27 PM Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Mon, Feb 07, 2022, David Matlack wrote:
> > On Fri, Feb 04, 2022 at 06:56:55AM -0500, Paolo Bonzini wrote:
> > > The TDP MMU has a performance regression compared to the legacy
> > > MMU when CR0 changes often.  This was reported for the grsecurity
> > > kernel, which uses CR0.WP to implement kernel W^X.  In that case,
> > > each change to CR0.WP unloads the MMU and causes a lot of unnecessary
> > > > work.  When running nested, this can even cause the L1 to hardly
> > > > make progress, as the L0 hypervisor is overwhelmed by the amount
> > > > of MMU work that is needed.
> > >
> > > The root cause of the issue is that the "MMU role" in KVM is a mess
> > > that mixes the CPU setup (CR0/CR4/EFER, SMM, guest mode, etc.)
> > > and the shadow page table format.  Whenever something is different
> > > between the MMU and the CPU, it is stored as an extra field in struct
> > > kvm_mmu---and for extra bonus complication, sometimes the same thing
> > > is stored in both the role and an extra field.
> > >
> > > So, this is the "no functional change intended" part of the changes
> > > required to fix the performance regression.  It separates neatly
> > > the shadow page table format ("MMU role") from the guest page table
> > > format ("CPU role"), and removes the duplicate fields.
> >
> > What do you think about calling this the guest_role instead of cpu_role?
> > There is a bit of a precedent for using "guest" instead of "cpu" already
> > for this type of concept (e.g. guest_walker), and I find it more
> > intuitive.
>
> Haven't looked at the series yet, but I'd prefer not to use guest_role, it's
> too similar to is_guest_mode() and kvm_mmu_role.guest_mode.  E.g. we'd end up with
>
>   static union kvm_mmu_role kvm_calc_guest_role(struct kvm_vcpu *vcpu,
>                                               const struct kvm_mmu_role_regs *regs)
>   {
>         union kvm_mmu_role role = {0};
>
>         role.base.access = ACC_ALL;
>         role.base.smm = is_smm(vcpu);
>         role.base.guest_mode = is_guest_mode(vcpu);
>         role.base.direct = !____is_cr0_pg(regs);
>
>         ...
>   }
>
> and possibly
>
>         if (guest_role.guest_mode)
>                 ...
>
> which would be quite messy.  Maybe vcpu_role if cpu_role isn't intuitive?

I agree it's a little odd. But actually it's somewhat intuitive (the
guest is in guest-mode, i.e. we're running a nested guest).

OK, I'm stretching a little bit :). But if the trade-off is just that
"guest_role.guest_mode" needs a clarifying comment while the rest of
the code gets more readable (cpu_role is used a lot more than
role.guest_mode), it still might be worth it.
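
To make the trade-off concrete, here is a minimal sketch of what the
commented access might look like. Everything named sketch_* is a
hypothetical, simplified stand-in invented for illustration; these are
not the actual KVM types or helpers, and the real role union has many
more fields.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the role; not the real KVM layout. */
struct sketch_role {
	bool smm;
	bool guest_mode;	/* vCPU is running a nested (L2) guest */
	bool direct;
};

/* Hypothetical stand-in for the vCPU state the role is derived from. */
struct sketch_vcpu {
	bool in_smm;
	bool in_guest_mode;
	bool cr0_pg;
};

/* Derive the "guest role" from the (simplified) vCPU state. */
static struct sketch_role sketch_calc_guest_role(const struct sketch_vcpu *vcpu)
{
	struct sketch_role role = {0};

	role.smm = vcpu->in_smm;
	role.guest_mode = vcpu->in_guest_mode;
	role.direct = !vcpu->cr0_pg;

	return role;
}

static bool sketch_is_nested(const struct sketch_role *guest_role)
{
	/*
	 * The clarifying comment in question: "guest_role" describes the
	 * guest's own paging setup, while "guest_mode" means the vCPU is
	 * currently running a nested guest.
	 */
	return guest_role->guest_mode;
}
```

The one confusing access sits behind a single commented helper, while
every other user just sees guest_role.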