linux-kernel - Re: [PATCH] KVM: check userspace

Message-ID: <98d2c1c4ad3e23def8f9a7c71df6b90217b42a88.camel@redhat.com>
Date:   Thu, 11 Jun 2020 23:11:47 +0300
From:   Maxim Levitsky <mlevitsk@...hat.com>
To:     Paolo Bonzini <pbonzini@...hat.com>, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH] KVM: check userspace_addr for all memslots

On Thu, 2020-06-11 at 17:27 +0200, Paolo Bonzini wrote:
> On 11/06/20 16:44, Maxim Levitsky wrote:
> > On Mon, 2020-06-01 at 04:21 -0400, Paolo Bonzini wrote:
> > > The userspace_addr alignment and range checks are not performed for private
> > > memory slots that are prepared by KVM itself.  This is unnecessary and makes
> > > it questionable to use __*_user functions to access memory later on.  We also
> > > rely on the userspace address being aligned since we have an entire family
> > > of functions to map gfn to pfn.
> > > 
> > > Fortunately skipping the check is completely unnecessary.  Only x86 uses
> > > private memslots and their userspace_addr is obtained from vm_mmap,
> > > therefore it must be below PAGE_OFFSET.  In fact, any attempt to pass
> > > an address above PAGE_OFFSET would have failed because such an address
> > > would return true for kvm_is_error_hva.
> > > 
> > > Reported-by: Linus Torvalds <torvalds@...ux-foundation.org>
> > > Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
> > 
> > I bisected a broken VM on my AMD system (3970X) to this patch.
> > 
> > It happens because I have AVIC enabled (which uses a private KVM
> > memslot), but AVIC is permanently disabled for that VM, since I enabled
> > nesting for it (+svm). That triggers the code in __x86_set_memory_region
> > that sets the userspace_addr of the disabled memslot to a non-canonical
> > address (0xdeadull << 48). After this patch, __kvm_set_memory_region
> > rejects that address, so the memslot is silently not disabled, which
> > hangs the guest.
> > 
> > The call comes from avic_update_access_page, which is called from
> > svm_pre_update_apicv_exec_ctrl, which discards the return value.
> > 
> > 
> > I think the fix would be either to make access_ok always return true
> > for size==0, or to have __kvm_set_memory_region treat size==0 specially
> > and skip that check.
> 
> Or just set hva to 0.  Deletion goes through kvm_delete_memslot so that
> dummy hva is not used anywhere.  If we really want to poison the hva of
> deleted memslots we should not do it specially in
> __x86_set_memory_region.  I'll send a patch.

After checking exactly what access_ok does, I mostly agree with this.
There is still an implicit assumption that address 0 is a valid userspace address.
It is fair to assume that on x86 though.
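
To illustrate the point, here is a rough userspace model of the check, not the kernel code itself. The names model_access_ok, model_memslot_hva_ok, MODEL_USER_LIMIT and MODEL_PAGE_SIZE are made up for this sketch, and the limit value is only an assumed stand-in for the x86-64 user address limit. It shows why a zero-sized range at (0xdeadull << 48) is rejected while a zero-sized range at hva 0 passes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for the x86-64 user address limit, not taken from the kernel. */
#define MODEL_USER_LIMIT 0x00007ffffffff000ULL
#define MODEL_PAGE_SIZE  4096ULL

/* Simplified range check: [addr, addr + size) must not wrap and must end within the limit. */
static bool model_access_ok(uint64_t addr, uint64_t size)
{
	uint64_t end = addr + size;

	if (end < addr)		/* addr + size wrapped around */
		return false;
	return end <= MODEL_USER_LIMIT;
}

/* Model of the memslot check: hva must be page aligned and pass the range check. */
static bool model_memslot_hva_ok(uint64_t hva, uint64_t size)
{
	if (hva & (MODEL_PAGE_SIZE - 1))
		return false;
	return model_access_ok(hva, size);
}
```

In this model, model_memslot_hva_ok(0xdeadULL << 48, 0) fails even though the size is zero, because the start address alone is already above the limit, whereas model_memslot_hva_ok(0, 0) succeeds.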

Best regards,
	Maxim Levitsky

> 
> Paolo
>