Message-ID: <ZK87jGkrc9/LVsWz@google.com>
Date: Wed, 12 Jul 2023 16:47:24 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Like Xu <like.xu.linux@...il.com>
Cc: Luiz Capitulino <luizcap@...zon.com>,
Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Li RongQing <lirongqing@...du.com>,
Yong He <zhuangel570@...il.com>,
Robert Hoo <robert.hoo.linux@...il.com>,
Kai Huang <kai.huang@...el.com>
Subject: Re: [PATCH] KVM: x86/mmu: Add "never" option to allow sticky
disabling of nx_huge_pages
On Wed, Jul 12, 2023, Like Xu wrote:
> On 2023/6/15 03:07, Sean Christopherson wrote:
> > On Wed, Jun 14, 2023, Luiz Capitulino wrote:
> > > > Applied to kvm-x86 mmu. I kept the default as "auto" for now, as that can go on
> > > > top and I don't want to introduce that change this late in the cycle. If no one
> > > > beats me to the punch (hint, hint ;-) ), I'll post a patch to make "never" the
> > > > default for unaffected hosts so that we can discuss/consider that change for 6.6.
> > >
> > > Thanks Sean, I agree with the plan. I could give a try on the patch if you'd like.
> >
> > Yes please, thanks!
>
> As a KVM/x86 *feature*, splitting and reconstructing large pages has
> other potential use cases, e.g. making performance test comparisons
> easier, not just itlb_multihit mitigation.
Enabling and disabling dirty logging is a far better tool for that, as it gives
userspace much more explicit control over what pages are split/reconstituted,
and when.
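As a rough sketch of that approach (not code from this thread): userspace can
toggle KVM_MEM_LOG_DIRTY_PAGES on an existing memslot by re-issuing
KVM_SET_USER_MEMORY_REGION with the same slot, guest physical address, size and
host address, changing only the flags. The helper names make_region() and
set_dirty_logging() are hypothetical, and vm_fd is assumed to be a VM file
descriptor obtained from KVM_CREATE_VM. Whether KVM splits huge pages eagerly
or on the first write fault, and when it rebuilds them after logging is turned
off, depends on the kernel version and configuration.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: describe a memslot, with dirty logging either
 * enabled (KVM_MEM_LOG_DIRTY_PAGES) or disabled (no flags).
 */
struct kvm_userspace_memory_region
make_region(uint32_t slot, uint64_t gpa, uint64_t size, void *hva,
	    int log_dirty)
{
	struct kvm_userspace_memory_region r;

	memset(&r, 0, sizeof(r));
	r.slot = slot;
	r.flags = log_dirty ? KVM_MEM_LOG_DIRTY_PAGES : 0;
	r.guest_phys_addr = gpa;
	r.memory_size = size;
	r.userspace_addr = (uint64_t)(uintptr_t)hva;
	return r;
}

/*
 * Hypothetical helper: flip dirty logging on an existing memslot.  The
 * slot, gpa, size and hva must match the values used when the slot was
 * created, so that only the flags change.  Returns the ioctl() result.
 */
int set_dirty_logging(int vm_fd, uint32_t slot, uint64_t gpa, uint64_t size,
		      void *hva, int enable)
{
	struct kvm_userspace_memory_region r =
		make_region(slot, gpa, size, hva, enable);

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &r);
}
```

Because the flag is per memslot, a test harness can pick exactly which guest
physical ranges are affected and when, which is the explicit control referred
to above.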
> On unaffected machines (ICX and later), nx_huge_pages is already "N",
> and turning it into "never" doesn't help materially in the mitigation
> implementation, but loses flexibility.
I'm becoming more and more convinced that losing the flexibility is perfectly
acceptable. There's a very good argument to be made that mitigating DoS attacks
from the guest kernel should be done several levels up, e.g. by refusing to create
VMs for a customer that is bringing down hosts. As Jim has pointed out, plugging
the hole only works if you are 100% confident there are no other holes, and will
never be other holes.
> IMO, the real issue here is that the kernel thread
> "kvm-nx-lpage-recovery" is created unconditionally. We also need to be
> aware of the existence of commit 084cc29f8bbb ("KVM: x86/MMU: Allow NX
> huge pages to be disabled on a per-vm basis").
>
> One of the technical proposals is to defer kvm_vm_create_worker_thread()
> to kvm_mmu_create() or kvm_init_mmu(), based on
> kvm->arch.disable_nx_huge_pages, even until guest paging mode is enabled
> on the first vcpu.
>
> Is this step worth taking ?
IMO, no. In hindsight, adding KVM_CAP_VM_DISABLE_NX_HUGE_PAGES was likely a
mistake; requiring CAP_SYS_BOOT makes it annoyingly difficult to safely use the
capability. My preference at this point is to make changes to the NX hugepage
mitigation only when there is a substantial benefit to an already-deployed usecase.