lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 24 May 2023 09:06:36 +0200
From:   David Hildenbrand <david@...hat.com>
To:     David Howells <dhowells@...hat.com>,
        Christoph Hellwig <hch@...radead.org>
Cc:     Jens Axboe <axboe@...nel.dk>, Al Viro <viro@...iv.linux.org.uk>,
        Matthew Wilcox <willy@...radead.org>, Jan Kara <jack@...e.cz>,
        Jeff Layton <jlayton@...nel.org>,
        Jason Gunthorpe <jgg@...dia.com>,
        Logan Gunthorpe <logang@...tatee.com>,
        Hillf Danton <hdanton@...a.com>,
        Christian Brauner <brauner@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-fsdevel@...r.kernel.org, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: Extending page pinning into fs/direct-io.c

On 23.05.23 22:16, David Howells wrote:
> Christoph Hellwig <hch@...radead.org> wrote:
> 
>> But can you please also take care of the legacy direct I/O code?  I'd really
>> hate to leave yet another unfinished transition around.
> 
> I've been poking at it this afternoon, but it doesn't look like it's going to
> be straightforward, unfortunately.  The mm folks have been withdrawing access
> to the pinning API behind the ramparts of the mm/ dir.  Further, the dio code
> will (I think), under some circumstances, arbitrarily insert the zero_page
> into a list of things that are maybe pinned or maybe unpinned, but I can (I
> think) also be given a pinned zero_page from the GUP code if the page tables
> point to one and a DIO-write is requested - so just doing if page == zero_page
> isn't sufficient.
> 
> What I'd like to do is to make the GUP code not take a ref on the zero_page
> if, say, FOLL_DONT_PIN_ZEROPAGE is passed in, and then make the bio cleanup
> code always ignore the zero_page.

We discussed doing that unconditionally in the context of vfio (below), but vfio
decided to add a workaround suitable for stable.

In case of FOLL_PIN it's simple: if we detect the zeropage, don't mess with the
refcount when pinning and don't mess with the refcount when unpinning (esp.
unpin_user_pages). FOLL_GET is a different story but we don't have to mess with
that.

So there shouldn't be need for a FOLL_DONT_PIN_ZEROPAGE, we could just do it
unconditionally.

> 
> Alternatively, I can drop the pin immediately if I get given one on the
> zero_page - it's not going anywhere, after all.

That's what vfio did in

commit 873aefb376bbc0ed1dd2381ea1d6ec88106fdbd4
Author: Alex Williamson <alex.williamson@...hat.com>
Date:   Mon Aug 29 21:05:40 2022 -0600

     vfio/type1: Unpin zero pages
     
     There's currently a reference count leak on the zero page.  We increment
     the reference via pin_user_pages_remote(), but the page is later handled
     as an invalid/reserved page, therefore it's not accounted against the
     user and not unpinned by our put_pfn().
     
     Introducing special zero page handling in put_pfn() would resolve the
     leak, but without accounting of the zero page, a single user could
     still create enough mappings to generate a reference count overflow.
     
     The zero page is always resident, so for our purposes there's no reason
     to keep it pinned.  Therefore, add a loop to walk pages returned from
     pin_user_pages_remote() and unpin any zero pages.


For vfio that handling no longer required, because FOLL_LONGTERM will never pin
the shared zeropage.

-- 
Thanks,

David / dhildenb

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ