linux-kernel - Re: [PATCH v3 3/8] KVM: x86/mmu: Rename NX huge pages fields/functions for consistency


Message-ID: <Yv/Wu46A98nz57YQ@google.com>
Date:   Fri, 19 Aug 2022 18:30:19 +0000
From:   Mingwei Zhang <mizhang@...gle.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, David Matlack <dmatlack@...gle.com>,
        Yan Zhao <yan.y.zhao@...el.com>,
        Ben Gardon <bgardon@...gle.com>
Subject: Re: [PATCH v3 3/8] KVM: x86/mmu: Rename NX huge pages
 fields/functions for consistency

On Thu, Aug 18, 2022, Sean Christopherson wrote:
> On Thu, Aug 18, 2022, Mingwei Zhang wrote:
> > On Wed, Aug 17, 2022, Sean Christopherson wrote:
> > > Yes, they are shadow pages that the NX recovery thread should zap, but the reason
> > > they should be zapped is because (a) the shadow page has at least one execute child
> > > SPTE, (b) zapping the shadow page will also zap its child SPTEs, and (c) eliminating
> > > all executable child SPTEs means KVM _might_ be able to instantiate an NX huge page.
> > > 
> > 
> > Oh, I scratched my head and finally got your point. Hmm. So the shadow
> > pages are the 'blockers' to (re)creating an NX huge page because of at
> > least one present child executable SPTE. So, really, whether these
> > shadow pages themselves are NX huge or not does not really matter. All
> > we need to know is that they will be zapped in the future to help make
> > recovery of an NX huge page possible.
> 
> More precisely, we want to zap shadow pages with executable children if and only
> if they can _possibly_ be replaced with an NX huge page.  The "possibly" is saying
> that zapping _may or may not_ result in an NX huge page.  And it also conveys that
> pages that _cannot_ be replaced with an NX huge page are not on the list.
> 
> If the guest is still using any of the huge page for execution, then KVM can't
> create an NX huge page (or it may temporarily create one and then zap it when the
> guest takes an executable fault), but KVM can't know that until it zaps and the
> guest takes a fault.  Thus, possibly.
> 

Right, I think 'possible' is definitely a correct name for that. In
general, using 'possible' can cover the complexity to ensure the
description is correct. My only comment here is that 'possible_' might
require extra comments in the code to be more developer friendly.

But overall, now that I remember what the problem was, I no longer think
this naming is an issue for me. It is just that the name could be
better.

> > With that, since you already mentioned the name:
> > 'mmu_pages_that_can_possibly_be_replaced_by_nx_huge_pages',
> > why can't we shorten it by using 'mmu_pages_to_recover_nx_huge' or
> > 'pages_to_recover_nx_huge'? 'recover' is the word that immediately
> > connects with the 'recovery thread', which I think makes more sense for
> > readability.
> 
> mmu_pages_to_recover_nx_huge doesn't capture that recovery isn't guaranteed.
> IMO it also does a poor job of capturing _why_ pages are on the list, i.e. a
> reader knows they are pages that will be "recovered", but it doesn't clarify that
> they'll be recovered/zapped because KVM might be able to replace them with NX
> huge pages.  In other words, it doesn't help the reader understand why some, but
> not all, nx_huge_page_disallowed pages are on the recovery list.

I think you are right that the name does not call out 'why' the pages
are on the list. But on the other hand, I am not sure how much the list
name alone can clarify the situation. I would propose we document the
conditions as (flag, list) pairs:

(nx_huge_page_disallowed, possible_nx_huge_pages)

case (true,  in_list):     mitigation for multi-hit iTLB.
case (true,  not_in_list): dirty logging disabled; address misalignment; guest did not turn on paging.
case (false, in_list):     not possible.
case (false, not_in_list): any other situation where KVM manipulates SPTEs.
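
The four cases above could also be written down as a small classifier, which
makes the one invalid combination explicit. This is only a sketch: the enum,
classify_sp() and sp_state_is_valid() are hypothetical names made up for
illustration, not existing KVM code, and the two booleans merely stand in for
the nx_huge_page_disallowed flag and membership on the possible_nx_huge_pages
list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; none of this is KVM code. */
enum sp_state {
	SP_ITLB_MITIGATION,  /* (true,  in_list):     iTLB multi-hit mitigation */
	SP_DISALLOWED_OTHER, /* (true,  not_in_list): huge page ruled out for another reason */
	SP_INVALID,          /* (false, in_list):     not possible */
	SP_ORDINARY,         /* (false, not_in_list): any other SPTE manipulation */
};

/* Map the (flag, list) pair onto the four cases listed above. */
enum sp_state classify_sp(bool nx_huge_page_disallowed, bool on_possible_list)
{
	if (nx_huge_page_disallowed)
		return on_possible_list ? SP_ITLB_MITIGATION : SP_DISALLOWED_OTHER;
	return on_possible_list ? SP_INVALID : SP_ORDINARY;
}

/* List membership implies the flag is set; the reverse does not hold. */
bool sp_state_is_valid(bool nx_huge_page_disallowed, bool on_possible_list)
{
	return !on_possible_list || nx_huge_page_disallowed;
}

/* Debug-style check a caller could run before touching the list. */
void check_sp_state(bool nx_huge_page_disallowed, bool on_possible_list)
{
	assert(sp_state_is_valid(nx_huge_page_disallowed, on_possible_list));
}
```

Written this way, the recovery list is a strict subset of the disallowed
pages, which is the property the list name is trying to convey.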

Maybe this should be in the commit message of the previous patch.