linux-kernel - Re: [PATCH v3 3/8] KVM: x86/mmu: Rename NX huge pages fields/functions for consistency

Message-ID: <Yv7PHx2qSB0PwkP/@google.com>
Date:   Thu, 18 Aug 2022 23:45:35 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Mingwei Zhang <mizhang@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, David Matlack <dmatlack@...gle.com>,
        Yan Zhao <yan.y.zhao@...el.com>,
        Ben Gardon <bgardon@...gle.com>
Subject: Re: [PATCH v3 3/8] KVM: x86/mmu: Rename NX huge pages
 fields/functions for consistency

On Thu, Aug 18, 2022, Mingwei Zhang wrote:
> On Wed, Aug 17, 2022, Sean Christopherson wrote:
> > Yes, they are shadow pages that the NX recovery thread should zap, but the reason
> > they should be zapped is because (a) the shadow page has at least one execute child
> > SPTE, (b) zapping the shadow page will also zap its child SPTEs, and (c) eliminating
> > all executable child SPTEs means KVM _might_ be able to instantiate an NX huge page.
> > 
> 
> oh, I scratched my head and finally got your point. hmm. So the shadow
> pages are the 'blockers' to (re)create a NX huge page because of at
> least one present child executable spte. So, really, whether these
> shadow pages themselves are NX huge or not does not really matter. All
> we need to know is that they will be zapped in the future to help make
> recovery of an NX huge page possible.

More precisely, we want to zap shadow pages with executable children if and only
if they can _possibly_ be replaced with an NX huge page.  The "possibly" is saying
that zapping _may or may not_ result in an NX huge page.  And it also conveys that
pages that _cannot_ be replaced with an NX huge page are not on the list.

If the guest is still using any part of the huge page for execution, then KVM can't
create an NX huge page (or it may temporarily create one and then zap it when the
guest takes an executable fault), but KVM can't know that until it zaps and the
guest takes a fault.  Thus, possibly.
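
As a rough illustration of the "possibly", here is a toy userspace model of a
single huge-page-sized region.  It is only a sketch: struct toy_region,
toy_recover() and toy_fault() are made-up names, not KVM code, and the model
assumes a non-executable fault after a zap is what lets an NX huge page be
instantiated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one huge-page-sized region; NOT KVM's real data structures. */
struct toy_region {
	bool on_possible_list;	/* small-page shadow page is queued for recovery */
	bool mapped_huge;	/* region is currently mapped by an NX huge page */
};

/* Recovery: zap the queued shadow page, leaving nothing mapped. */
static void toy_recover(struct toy_region *r)
{
	assert(r != NULL);
	if (r->on_possible_list)
		r->on_possible_list = false;
}

/* Fault handling after a zap, or on an existing NX huge page. */
static void toy_fault(struct toy_region *r, bool exec)
{
	assert(r != NULL);
	if (exec) {
		/* Execution forces small pages and re-queues the shadow page. */
		r->mapped_huge = false;
		r->on_possible_list = true;
	} else if (!r->on_possible_list) {
		/* No executable small mappings in the way: NX huge page is OK. */
		r->mapped_huge = true;
	}
}
```

In this model, the outcome of toy_recover() is only known once the next
toy_fault() arrives: a non-executable fault yields a huge mapping, an
executable fault puts the region straight back on the list.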

> > > `nx_huge_page_disallowed` is easy to understand because it literally says
> > > 'nx_huge_page is not allowed', which is correct.
> > 
> > No, it's not correct.  The list isn't simply the set of shadow pages that disallow
> > NX huge pages, it's the set of shadow pages that disallow NX huge pages _and_ that
> > can possibly be replaced by an NX huge page if the shadow page and all its
> > (executable) children go away.
> > 
> 
> hmm, I think this naming is correct. The flag is used to talk to the
> 'fault handler' to say 'hey, don't create nx huge page, stupid'. Of
> course, it is also used by the 'nx huge recovery thread', but the
> recovery thread will only check it for sanity purposes, which really does
> not matter, i.e., the thread will zap the pages anyway.

Ah, sorry, I thought you were suggesting "nx_huge_page_disallowed" for the list
name, but you were talking about the flag.  Yes, 100% agree that the flag is
appropriately named.

> > > But this one, it says 'possible nx_huge_pages', but they are not
> > > nx huge pages at all.
> > 
> > Yes, but they _can be_ NX huge pages, hence the "possible".  A super verbose name
> > would be something like mmu_pages_that_can_possibly_be_replaced_by_nx_huge_pages.
> > 
> 
> I can make a dramatic example as to why 'possible' may not help:
> 
> /* Flag that decides something important. */
> bool possible_one;
> 
> The information we (readers) gain from reading the above is _0_.

But that's only half the story.  If we also had an associated flag

  bool one_disallowed;

a.k.a. nx_huge_page_disallowed, then when viewed together, readers know that the
existence of this struct disallows "one", but that structs with one_disallowed=true
_and_ possible_one=true _might_ be converted to "one", whereas structs with
possible_one=false _cannot_ be converted to "one".
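
Spelled out as a toy example (a sketch only; struct toy_page, toy_verdict and
toy_classify() are hypothetical names for illustration, not anything in KVM),
the two flags read together give three distinct answers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a shadow page; NOT the real struct kvm_mmu_page. */
struct toy_page {
	bool one_disallowed;	/* a.k.a. nx_huge_page_disallowed */
	bool possible_one;	/* page is on the "possible" list */
};

enum toy_verdict {
	TOY_NOT_BLOCKING,	/* page does not disallow "one" */
	TOY_CANNOT_CONVERT,	/* disallows "one", can never become "one" */
	TOY_MIGHT_CONVERT,	/* disallows "one", might become "one" if zapped */
};

/* Combine the two flags into what a reader can infer about the page. */
static enum toy_verdict toy_classify(const struct toy_page *p)
{
	assert(p != NULL);
	if (!p->one_disallowed)
		return TOY_NOT_BLOCKING;
	return p->possible_one ? TOY_MIGHT_CONVERT : TOY_CANNOT_CONVERT;
}
```

Neither flag alone distinguishes TOY_CANNOT_CONVERT from TOY_MIGHT_CONVERT,
which is the information the "possible" half of the name is meant to carry.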

> With that, since you already mentioned the name:
> 'mmu_pages_that_can_possibly_be_replaced_by_nx_huge_pages',
> why can't we shorten it by using 'mmu_pages_to_recover_nx_huge' or
> 'pages_to_recover_nx_huge'? 'recover' is the word that immediately
> connects with the 'recovery thread', which I think makes more sense on
> readability.

mmu_pages_to_recover_nx_huge doesn't capture that recovery isn't guaranteed.
IMO it also does a poor job of capturing _why_ pages are on the list, i.e. a
reader knows they are pages that will be "recovered", but it doesn't clarify that
they'll be recovered/zapped because KVM might be able to replace them with NX
huge pages.  In other words, it doesn't help the reader understand why some, but
not all, nx_huge_page_disallowed pages are on the recovery list.