lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210602233353.gxq35yxluhas5knp@box>
Date:   Thu, 3 Jun 2021 02:33:53 +0300
From:   "Kirill A. Shutemov" <kirill@...temov.name>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Jim Mattson <jmattson@...gle.com>,
        David Rientjes <rientjes@...gle.com>,
        "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
        "Kleen, Andi" <andi.kleen@...el.com>,
        "Yamahata, Isaku" <isaku.yamahata@...el.com>,
        Erdem Aktas <erdemaktas@...gle.com>,
        Steve Rutherford <srutherford@...gle.com>,
        Peter Gonda <pgonda@...gle.com>,
        David Hildenbrand <david@...hat.com>,
        Chao Peng <chao.p.peng@...ux.intel.com>, x86@...nel.org,
        kvm@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFCv2 13/13] KVM: unmap guest memory using poisoned pages

On Wed, Jun 02, 2021 at 05:51:02PM +0000, Sean Christopherson wrote:
> > Omitting FOLL_GUEST for shared memory doesn't look like a right approach.
> > IIUC, it would require the kernel to track what memory is share and what
> > private, which defeat the purpose of the rework. I would rather enforce
> > !PageGuest() when share SEPT is populated in addition to enforcing
> > PageGuest() fro private SEPT.
> 
> Isn't that what omitting FOLL_GUEST would accomplish?  For shared memory,
> including mapping memory into the shared EPT, KVM will omit FOLL_GUEST and thus
> require the memory to be readable/writable according to the guest access type.

Ah. I guess I see what you're saying: we can pipe down the shared bit from
GPA from direct_page_fault() (or whatever handles the fault) down to
hva_to_pfn_slow() and omit FOLL_GUEST if the shared bit is set. Right?

I guest it's doable, but codeshuffling going to be ugly.

> By definition, that excludes PageGuest() because PageGuest() pages must always
> be unmapped, e.g. PROTNONE.  And for private EPT, because PageGuest() is always
> PROTNONE or whatever, it will require FOLL_GUEST to retrieve the PTE/PMD/Pxx.
> 
> On a semi-related topic, I don't think can_follow_write_pte() is the correct
> place to hook PageGuest().  TDX's S-EPT has a quirk where all private guest
> memory must be mapped writable, but that quirk doesn't hold true for non-TDX
> guests.  It should be legal to map private guest memory as read-only.

Hm. The point of the change in can_follow_write_pte() is to only allow to
write to a PageGuest() page if FOLL_GUEST is used and the mapping is
writable. Without the change gup(FOLL_GUEST|FOLL_WRITE) would fail.

It doesn't prevent using read-only guest mappings as read-only. But if you
want to write to it it has to writable (in addtion to FOLL_GUEST). 

> And I believe the below snippet in follow_page_pte() will be problematic
> too, since FOLL_NUMA is added unless FOLL_FORCE is set.  I suspect the
> correct approach is to handle FOLL_GUEST as an exception to
> pte_protnone(), though that might require adjusting pte_protnone() to be
> meaningful even when CONFIG_NUMA_BALANCING=n.
> 
> 	if ((flags & FOLL_NUMA) && pte_protnone(pte))
> 		goto no_page;
> 	if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, flags)) {
> 		pte_unmap_unlock(ptep, ptl);
> 		return NULL;
> 	}

Good catch. I'll look into how to untangle NUMA balancing and PageGuest().
It shouldn't be hard. PageGuest() pages should be subject for balancing.

> > Do you see any problems with this?
> > 
> > > Oh, and the other nicety is that I think it would avoid having to explicitly
> > > handle PageGuest() memory that is being accessed from kernel/KVM, i.e. if all
> > > memory exposed to KVM must be !PageGuest(), then it is also eligible for
> > > copy_{to,from}_user().
> > 
> > copy_{to,from}_user() enforce by setting PTE entries to PROT_NONE.
> 
> But KVM does _not_ want those PTEs PROT_NONE.  If KVM is accessing memory that
> is also accessible by the the guest, then it must be shared.  And if it's shared,
> it must also be accessible to host userspace, i.e. something other than PROT_NONE,
> otherwise the memory isn't actually shared with anything.
> 
> As above, any guest-accessible memory that is accessed by the host must be
> shared, and so must be mapped with the required permissions.

I don't see contradiction here: copy_{to,from}_user() would fail with
-EFAULT on PROT_NONE PTE.

By saying in initial posting that inserting PageGuest() into shared is
fine, I didn't mean it's usefule, just allowed.

-- 
 Kirill A. Shutemov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ