linux-kernel - Re: [PATCH v5 4/4] KVM: mmu: remove over-aggressive warnings

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YdhugJ6h76JLHTjT@google.com>
Date:   Fri, 7 Jan 2022 16:46:56 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     David Stevens <stevensd@...omium.org>
Cc:     Marc Zyngier <maz@...nel.org>, Paolo Bonzini <pbonzini@...hat.com>,
        James Morse <james.morse@....com>,
        Alexandru Elisei <alexandru.elisei@....com>,
        Suzuki K Poulose <suzuki.poulose@....com>,
        Will Deacon <will@...nel.org>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        Chia-I Wu <olv@...omium.org>
Subject: Re: [PATCH v5 4/4] KVM: mmu: remove over-aggressive warnings

On Fri, Jan 07, 2022, Sean Christopherson wrote:
> On Fri, Jan 07, 2022, David Stevens wrote:
> > > > These are the type of pages which KVM is currently rejecting. Is this
> > > > something that KVM can support?
> > >
> > > I'm not opposed to it.  My complaint is that this series is incomplete in that it
> > > allows mapping the memory into the guest, but doesn't support accessing the memory
> > > from KVM itself.  That means for things to work properly, KVM is relying on the
> > > guest to use the memory in a limited capacity, e.g. isn't using the memory as
> > > general purpose RAM.  That's not problematic for your use case, because presumably
> > > the memory is used only by the vGPU, but as is KVM can't enforce that behavior in
> > > any way.
> > >
> > > The really gross part is that failures are not strictly punted to userspace;
> > > the resulting error varies significantly depending on how the guest "illegally"
> > > uses the memory.
> > >
> > > My first choice would be to get the amdgpu driver "fixed", but that's likely an
> > > unreasonable request since it sounds like the non-KVM behavior is working as intended.
> > >
> > > One thought would be to require userspace to opt-in to mapping this type of memory
> > > by introducing a new memslot flag that explicitly states that the memslot cannot
> > > be accessed directly by KVM, i.e. can only be mapped into the guest.  That way,
> > > KVM has an explicit ABI with respect to how it handles this type of memory, even
> > > though the semantics of exactly what will happen if userspace/guest violates the
> > > ABI are not well-defined.  And internally, KVM would also have a clear touchpoint
> > > where it deliberately allows mapping such memslots, as opposed to the more implicit
> > > behavior of bypassing ensure_pfn_ref().
> > 
> > Is it well defined when KVM needs to directly access a memslot?
> 
> Not really, there's certainly no established rule.
> 
> > At least for x86, it looks like most of the use cases are related to nested
> > virtualization, except for the call in emulator_cmpxchg_emulated.
> 
> The emulator_cmpxchg_emulated() will hopefully go away in the nearish future[*].

Forgot the link...

https://lore.kernel.org/all/YcG32Ytj0zUAW%2FB2@hirez.programming.kicks-ass.net/

> Paravirt features that communicate between guest and host via memory is the other
> case that often maps a pfn into KVM.
> 
> > Without being able to specifically state what should be avoided, a flag like
> > that would be difficult for userspace to use.
> 
> Yeah :-(  I was thinking KVM could state the flag would be safe to use if and only
> if userspace could guarantee that the guest would use the memory for some "special"
> use case, but hadn't actually thought about how to word things.
> 
> The best thing to do is probably to wait for for kvm_vcpu_map() to be eliminated,
> as described in the changelogs for commits:
> 
>   357a18ad230f ("KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cache")
>   7e2175ebd695 ("KVM: x86: Fix recording of guest steal time / preempted status")
> 
> Once that is done, everything in KVM will either access guest memory through the
> userspace hva, or via a mechanism that is tied into the mmu_notifier, at which
> point accessing non-refcounted struct pages is safe and just needs to worry about
> not corrupting _refcount.