Date:   Fri, 18 Sep 2020 21:01:53 -0300
From:   Jason Gunthorpe <jgg@...pe.ca>
To:     John Hubbard <jhubbard@...dia.com>
Cc:     Peter Xu <peterx@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Leon Romanovsky <leonro@...dia.com>,
        Linux-MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        "Maya B . Gokhale" <gokhale2@...l.gov>,
        Yang Shi <yang.shi@...ux.alibaba.com>,
        Marty Mcfadden <mcfadden8@...l.gov>,
        Kirill Shutemov <kirill@...temov.name>,
        Oleg Nesterov <oleg@...hat.com>, Jann Horn <jannh@...gle.com>,
        Jan Kara <jack@...e.cz>, Kirill Tkhai <ktkhai@...tuozzo.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Christoph Hellwig <hch@....de>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 1/4] mm: Trial do_wp_page() simplification

On Fri, Sep 18, 2020 at 02:06:23PM -0700, John Hubbard wrote:
> On 9/18/20 1:40 PM, Peter Xu wrote:
> > On Fri, Sep 18, 2020 at 02:32:40PM -0300, Jason Gunthorpe wrote:
> > > On Fri, Sep 18, 2020 at 12:40:32PM -0400, Peter Xu wrote:
> > > 
> > > > Firstly in the draft patch mm->has_pinned is introduced and it's written to 1
> > > > as long as FOLL_GUP is called once.  It's never reset after set.
> > > 
> > > Worth thinking about also adding FOLL_LONGTERM here, at least as long
> > > as it is not a counter. That further limits the impact.
> > 
> > But theoretically we should also trigger COW here for pages even with PIN &&
> > !LONGTERM, am I right?  Assuming that FOLL_PIN is already a corner case.
> > 
> 
> This note, plus Linus' comment about "I'm a normal process, I've never
> done any special rdma page pinning", has me a little worried. Because
> page_maybe_dma_pinned() is counting both short- and long-term pins,
> actually. And that includes O_DIRECT callers.
> 
> O_DIRECT pins are short-term, and RDMA systems are long-term (and should
> be setting FOLL_LONGTERM). But there's no way right now to discern
> between them, once the initial pin_user_pages*() call is complete. All
> we can do today is to count the number of FOLL_PIN calls, not the number
> of FOLL_PIN | FOLL_LONGTERM calls.

My thinking is that to hit this issue you already have to be doing
FOLL_LONGTERM, and if some driver hasn't been properly marked and
regresses, the fix is to mark it.

Remember, this use case requires the pin to extend after a system
call, past another fork() system call, and still have data-coherence.

IMHO that can only happen in the FOLL_LONGTERM case, as it inherently
means the lifetime of the pin is being controlled by userspace, not by
the kernel. Otherwise userspace could not cause new DMA touches after
fork.

Explaining it like that makes me pretty confident it is the right
thing to do, at least for a single bit.

Yes, if we figure out how to do a counter, then the counter can cover
everything, but for now, as an rc regression fix, let us limit the
number of impacted cases. We don't need to worry about the unpin problem
because the bit is never cleared.
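
To make the single-bit idea concrete, here is a rough userspace model of
it, not kernel code. Everything prefixed model_ or MODEL_ is hypothetical
and exists only for illustration; the flag values are made up and are not
the kernel's FOLL_* values. It assumes the bit is set only when a request
carries both PIN and LONGTERM, and that unpin never clears it.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits only; NOT the kernel's actual FOLL_* values. */
#define MODEL_FOLL_PIN      0x1u
#define MODEL_FOLL_LONGTERM 0x2u

/* Hypothetical stand-in for the one mm field under discussion. */
struct model_mm {
	bool has_pinned;	/* sticky: set once, never cleared */
};

/* Model of a pin request: only PIN together with LONGTERM sets the bit. */
static void model_pin(struct model_mm *mm, unsigned int flags)
{
	const unsigned int want = MODEL_FOLL_PIN | MODEL_FOLL_LONGTERM;

	if ((flags & want) == want)
		mm->has_pinned = true;
}

/* Model of unpin: deliberately leaves the bit alone, so no unpin tracking. */
static void model_unpin(struct model_mm *mm)
{
	(void)mm;
}

/* Model of the check a fork()/COW path could make (assumed usage). */
static bool model_needs_pin_care(const struct model_mm *mm)
{
	return mm->has_pinned;
}

/* Walk through the scenario; returns 1 if the process ends up flagged. */
static int model_demo(void)
{
	struct model_mm mm = { .has_pinned = false };

	/* Short-term pin (O_DIRECT-style): process stays unflagged. */
	model_pin(&mm, MODEL_FOLL_PIN);
	assert(!model_needs_pin_care(&mm));

	/* Long-term pin (RDMA-style): flagged, and stays so after unpin. */
	model_pin(&mm, MODEL_FOLL_PIN | MODEL_FOLL_LONGTERM);
	model_unpin(&mm);
	return model_needs_pin_care(&mm) ? 1 : 0;
}
```

In this model a process that only ever does short-term pins is never
flagged, which is the "limit the number of impacted cases" part, and a
process that did one long-term pin stays flagged for good, which is why
there is no unpin bookkeeping to get wrong.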

Jason
