lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 26 Oct 2020 11:16:01 -0300
From:   Jason Gunthorpe <jgg@...pe.ca>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     John Hubbard <jhubbard@...dia.com>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        David Rientjes <rientjes@...gle.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Kees Cook <keescook@...omium.org>,
        Will Drewry <wad@...omium.org>,
        "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
        "Kleen, Andi" <andi.kleen@...el.com>,
        Liran Alon <liran.alon@...cle.com>,
        Mike Rapoport <rppt@...nel.org>, x86@...nel.org,
        kvm@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [RFCv2 08/16] KVM: Use GUP instead of copy_from/to_user() to
 access guest memory

On Mon, Oct 26, 2020 at 01:28:30PM +0000, Matthew Wilcox wrote:

> > > It's been five years since DAX was merged, and page pinning still
> > > doesn't work.  How much longer before the people who are pushing it
> > > realise that it's fundamentally flawed?
> > 
> > Is this a separate rant about *only* DAX, or is general RDMA in your sights
> > too? :)
> 
> This is a case where it's not RDMA's _fault_ that there's no good API
> for it to do what it needs to do.  There's a lot of work needed to wean
> Linux device drivers off their assumption that there's a struct page
> for every byte of memory.

People who care seem to have just given up and are using RDMA ODP, so
I'm not optimistic this DAX issue will ever be solved. I've also
almost removed all the struct page references from this flow in RDMA,
so if there is some way that helps it is certainly doable.

Regardless of DAX the pinning indication is now being used during
fork() for some good reasons, and seems to make sense in other use
cases. It just doesn't seem like a way to solve the DAX issue.

More or less it seems to mean that pages pinned cannot be write
protected and more broadly the kernel should not change the PTEs for
those pages independently of the application. ie the more agressive
COW on fork() caused data corruption regressions...

I wonder if the point here is that some page owners can't/won't
support DMA pinning and should just be blocked completely for them.

I'd much rather have write access pin_user_pages() just fail than oops
the kernel on ext4 owned VMAs, for instance.

Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ