Message-ID: <0C489207-F1C0-4D54-A55D-0983229F79E1@amazon.de>
Date:   Wed, 12 Apr 2017 13:16:50 +0000
From:   "Sironi, Filippo" <sironi@...zon.de>
To:     Radim Krčmář <rkrcmar@...hat.com>
CC:     "Sironi, Filippo" <sironi@...zon.de>,
        "Liguori, Anthony" <aliguori@...zon.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86, kvm: Handle PFNs outside of kernel reach when
 touching GPTEs

Thanks for taking the time and sorry for the delay.

> On 6. Apr 2017, at 16:22, Radim Krčmář <rkrcmar@...hat.com> wrote:
> 
> 2017-04-05 15:07+0200, Filippo Sironi:
>> cmpxchg_gpte() calls get_user_pages_fast() to retrieve the number of
>> pages and the respective struct pages for mapping in the kernel virtual
>> address space.
>> This doesn't work if get_user_pages_fast() is invoked with a userspace
>> virtual address that's backed by PFNs outside of kernel reach (e.g.,
>> when limiting the kernel memory with mem= in the command line and using
>> /dev/mem to map memory).
>> 
>> If get_user_pages_fast() fails, look up the VMA that backs the userspace
>> virtual address, compute the PFN and the physical address, and map it in
>> the kernel virtual address space with memremap().
> 
> What is the reason for a configuration that voluntarily restricts access
> to memory that it needs?

By using /dev/mem to provide VM memory, one can avoid the overhead of allocating a struct page for every page of memory, which is wasteful when a server is used entirely for hosting VMs.
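
To put a rough number on that overhead, here is a minimal sketch. It assumes 4 KiB pages and a 64-byte struct page, which is typical for x86-64 but depends on the kernel configuration; the function name and constants are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration; both depend on architecture and config. */
#define ASSUMED_PAGE_SIZE        4096ULL
#define ASSUMED_STRUCT_PAGE_SIZE 64ULL

/* Bytes of struct page metadata needed to describe mem_bytes of RAM. */
static uint64_t struct_page_overhead(uint64_t mem_bytes)
{
	/* Only whole pages are described by a struct page. */
	assert(mem_bytes % ASSUMED_PAGE_SIZE == 0);
	return mem_bytes / ASSUMED_PAGE_SIZE * ASSUMED_STRUCT_PAGE_SIZE;
}
```

Under these assumptions the metadata costs 64/4096, or about 1.6%, of the memory it describes, so a host with 256 GiB would spend roughly 4 GiB on struct pages.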

>> Signed-off-by: Filippo Sironi <sironi@...zon.de>
>> Cc: Anthony Liguori <aliguori@...zon.com>
>> Cc: kvm@...r.kernel.org
>> Cc: linux-kernel@...r.kernel.org
>> ---
>> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
>> @@ -147,15 +147,36 @@ static int FNAME(cmpxchg_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
>> 	struct page *page;
>> 
>> 	npages = get_user_pages_fast((unsigned long)ptep_user, 1, 1, &page);
>> -	/* Check if the user is doing something meaningless. */
>> -	if (unlikely(npages != 1))
>> -		return -EFAULT;
>> -
>> -	table = kmap_atomic(page);
>> -	ret = CMPXCHG(&table[index], orig_pte, new_pte);
>> -	kunmap_atomic(table);
>> -
>> -	kvm_release_page_dirty(page);
>> +	if (likely(npages == 1)) {
>> +		table = kmap_atomic(page);
>> +		ret = CMPXCHG(&table[index], orig_pte, new_pte);
>> +		kunmap_atomic(table);
>> +
>> +		kvm_release_page_dirty(page);
>> +	} else {
>> +		struct vm_area_struct *vma;
>> +		unsigned long vaddr = (unsigned long)ptep_user & PAGE_MASK;
>> +		unsigned long pfn;
>> +		unsigned long paddr;
>> +
>> +		down_read(&current->mm->mmap_sem);
>> +		vma = find_vma_intersection(current->mm, vaddr,
>> +					    vaddr + PAGE_SIZE);
> 
> Hm, with the argument order like this, we check that
> 
>  vaddr < vma->vm_end && vaddr + PAGE_SIZE > vma->vm_start
> 
> but shouldn't we actually check that the whole page is there, i.e.
> 
>  vaddr + PAGE_SIZE < vma->vm_end && vaddr > vma->vm_start
> 
> ?
> 
> Thanks.

Hm, right now we check for the following:

    vaddr >= vma->vm_start && vaddr < vma->vm_end && vaddr + PAGE_SIZE > vma->vm_start

Given that vaddr is PAGE_SIZE-aligned, we're guaranteed that vaddr + PAGE_SIZE <= vma->vm_end.
This seems more complex than necessary. I believe that:

    vma = find_vma(current->mm, vaddr);

should be enough.
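
For reference, the lookup semantics under discussion can be modeled in userspace as follows. This is only a sketch: struct vma_model and the model_* functions are hypothetical stand-ins that use a sorted array instead of the kernel's real VMA data structures, and page_inside() is an illustrative helper, not an existing kernel function.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL

/* Simplified stand-in for struct vm_area_struct: covers [vm_start, vm_end). */
struct vma_model {
	unsigned long vm_start;
	unsigned long vm_end;
};

/*
 * Model of find_vma(): return the first area with addr < vm_end, or NULL.
 * The areas must be sorted by address and must not overlap.
 */
static const struct vma_model *model_find_vma(const struct vma_model *areas,
					      size_t n, unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr < areas[i].vm_end)
			return &areas[i];
	return NULL;
}

/* Model of find_vma_intersection(): additionally require end > vm_start. */
static const struct vma_model *
model_find_vma_intersection(const struct vma_model *areas, size_t n,
			    unsigned long start, unsigned long end)
{
	const struct vma_model *vma = model_find_vma(areas, n, start);

	if (vma && end <= vma->vm_start)
		vma = NULL;
	return vma;
}

/* Return 1 if the whole page starting at vaddr lies inside the area. */
static int page_inside(const struct vma_model *vma, unsigned long vaddr)
{
	assert((vaddr & (PAGE_SIZE - 1)) == 0); /* vaddr must be page aligned */
	return vma && vaddr >= vma->vm_start &&
	       vaddr + PAGE_SIZE <= vma->vm_end;
}
```

In this model, an address that falls in a hole between two areas makes model_find_vma() return the next area above it. A bare find_vma() call would therefore presumably still need an explicit vaddr >= vma->vm_start check, whereas with page-aligned bounds the find_vma_intersection() form already implies it.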

>> +		if (!vma || !(vma->vm_flags & VM_PFNMAP)) {
>> +			up_read(&current->mm->mmap_sem);
>> +			return -EFAULT;
>> +		}
>> +		pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
>> +		paddr = pfn << PAGE_SHIFT;
>> +		table = memremap(paddr, PAGE_SIZE, MEMREMAP_WB);
> 
> (I don't understand why there isn't a wrapper for this ...
> Looks like we're doing something unexpected.)

Do you mean a wrapper for getting the pfn/paddr?
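
If so, such a wrapper might look roughly like the userspace sketch below. The name pfnmap_vaddr_to_paddr() and struct pfnmap_model are hypothetical, and the arithmetic assumes a linear mapping where vm_pgoff holds the first PFN, as is the case for a /dev/mem mapping.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Simplified stand-in for the fields of struct vm_area_struct used here. */
struct pfnmap_model {
	unsigned long vm_start;  /* first virtual address of the mapping */
	unsigned long vm_end;    /* one past the last virtual address */
	unsigned long vm_pgoff;  /* offset of the mapping, in pages */
};

/*
 * Hypothetical helper: translate a user virtual address inside a linear
 * PFN mapping into the physical address of its page. Returns 0 on
 * success, -1 if the address is outside the area.
 */
static int pfnmap_vaddr_to_paddr(const struct pfnmap_model *vma,
				 unsigned long uaddr, uint64_t *paddr)
{
	unsigned long vaddr = uaddr & PAGE_MASK;
	uint64_t pfn;

	if (vaddr < vma->vm_start || vaddr >= vma->vm_end)
		return -1;

	pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
	/* Shift in 64 bits so the result is not truncated on 32-bit. */
	*paddr = pfn << PAGE_SHIFT;
	return 0;
}
```

The sketch does the shift on a 64-bit value because pfn << PAGE_SHIFT on an unsigned long can overflow when physical addresses are wider than the native word, for example on 32-bit with PAE.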

>> +		if (!table) {
>> +			up_read(&current->mm->mmap_sem);
>> +			return -EFAULT;
>> +		}
>> +		ret = CMPXCHG(&table[index], orig_pte, new_pte);
>> +		memunmap(table);
>> +		up_read(&current->mm->mmap_sem);
>> +	}
>> 
>> 	return (ret != orig_pte);
>> }
>> -- 
>> 2.7.4

I'll submit a v2 soon.

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
