Date:   Thu, 4 Feb 2021 16:33:08 -0400
From:   Jason Gunthorpe <jgg@...pe.ca>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     Sean Christopherson <seanjc@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, David Stevens <stevensd@...gle.com>,
        Jann Horn <jannh@...gle.com>, kvm@...r.kernel.org
Subject: Re: [PATCH] mm: Export follow_pte() for KVM so that KVM can stop
 using follow_pfn()

On Thu, Feb 04, 2021 at 06:19:13PM +0100, Paolo Bonzini wrote:
> On 04/02/21 18:16, Sean Christopherson wrote:
> > Export follow_pte() to fix build breakage when KVM is built as a module.
> > An in-flight KVM fix switches from follow_pfn() to follow_pte() in order
> > to grab the page protections along with the PFN.
> > 
> > Fixes: bd2fae8da794 ("KVM: do not assume PTE is writable after follow_pfn")
> > Cc: David Stevens <stevensd@...gle.com>
> > Cc: Jann Horn <jannh@...gle.com>
> > Cc: Jason Gunthorpe <jgg@...pe.ca>
> > Cc: Paolo Bonzini <pbonzini@...hat.com>
> > Cc: kvm@...r.kernel.org
> > Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> > 
> > Paolo, maybe you can squash this with the appropriate acks?
> 
> Indeed, you beat me by a minute.  This change is why I hadn't sent out the
> patch yet.
> 
> Andrew or Jason, ok to squash this?

I think the usual process would be to put this in the patch/series/PR that
needs it.

Given how badly follow_pfn() has been misused, I would greatly prefer to
see you add a kdoc along with exporting follow_pte(), one that makes the
rules clear.

And it looks like we should remove the range argument for modular use.

And document the locking requirements, since it does a lockless read of
the page table:

	pgd = pgd_offset(mm, address);
	if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
		goto out;

	p4d = p4d_offset(pgd, address);

It doesn't do the trickery that fast GUP does, so it must require the
mmap sem in read mode at least.

Not sure I understand how fsdax is able to call it only under the
i_mmap_lock_read lock? What prevents a page table level from being
freed concurrently?

And it is missing READ_ONCE()s for the lockless page table walk... :(
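
To illustrate the READ_ONCE() point, here is a minimal userspace sketch, not
kernel code. The READ_ONCE() macro below is a simplified stand-in for the
kernel's, and toy_entry_t, toy_entry_none(), toy_entry_bad() and
toy_next_level() are hypothetical names invented for this example. The idea
is to snapshot the entry once into a local and do both the check and the
descent on that snapshot, instead of dereferencing the shared slot twice as
the pgd_none(*pgd) || pgd_bad(*pgd) pattern does:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for the kernel's READ_ONCE(): force exactly one load
 * through a volatile-qualified lvalue. The real macro does more checking.
 * __typeof__ is a GNU C extension.
 */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/*
 * Hypothetical toy "table entry": 0 means empty, low bit set means bad,
 * anything else is the address of the next-level table.
 */
typedef uintptr_t toy_entry_t;

static int toy_entry_none(toy_entry_t e) { return e == 0; }
static int toy_entry_bad(toy_entry_t e)  { return (e & 1) != 0; }

/*
 * Descend one level. The shared slot is read exactly once; the validity
 * checks and the pointer computation all use the same local snapshot, so a
 * concurrent writer cannot make the checks and the descent disagree.
 * Returns NULL if the entry is empty or bad.
 */
static toy_entry_t *toy_next_level(toy_entry_t *slot, size_t index)
{
	toy_entry_t e;

	assert(slot != NULL);
	e = READ_ONCE(*slot);	/* single load of the shared entry */
	if (toy_entry_none(e) || toy_entry_bad(e))
		return NULL;
	return (toy_entry_t *)e + index;
}
```

This only addresses the "entry changes between two reads" half of the
problem. It does nothing to keep the next-level table from being freed
underneath the walker, which is the locking question above.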

Jason
