Message-ID: <100D68C7BA14664A8938383216E40DE04062DEA1@FMSMSX114.amr.corp.intel.com>
Date:	Tue, 18 Feb 2014 14:15:59 +0000
From:	"Wilcox, Matthew R" <matthew.r.wilcox@...el.com>
To:	Rik van Riel <riel@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mgorman@...e.de>, Andi Kleen <ak@...ux.intel.com>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Dave Chinner <david@...morbit.com>,
	linux-mm <linux-mm@...ck.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: RE: [RFC, PATCHv2 0/2] mm: map few pages around fault address if
 they are in page cache

We don't really need to lock all the pages being returned to protect against truncate.  We only need to lock the one at the highest index and check i_size while that lock is held, since truncate_inode_pages_range() will block on any page that is locked.

We're still vulnerable to holepunches, but there's no locking currently between holepunches and truncate, so we're no worse off now.
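
The idea of locking only the highest-index page can be sketched as a small userspace model. This is not kernel code: `model_page`, `model_inode`, `model_trylock()`, `model_map_around()` and `MODEL_PAGE_SIZE` are hypothetical names made up for illustration, and the single-threaded model only shows the order of the checks, assuming truncate shrinks i_size before it starts waiting on locked pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL   /* assumed page size for the model */

/* Hypothetical stand-ins for struct page and struct inode. */
struct model_page {
    bool locked;                 /* models the page lock bit */
};

struct model_inode {
    unsigned long i_size;        /* file size in bytes */
    struct model_page *pages;    /* page cache, indexed by page offset */
    size_t nr_pages;
};

/* Non-blocking lock attempt, loosely modelled on a trylock of the page. */
static bool model_trylock(struct model_page *p)
{
    if (p->locked)
        return false;
    p->locked = true;
    return true;
}

static void model_unlock(struct model_page *p)
{
    p->locked = false;
}

/*
 * Decide whether pages [first, last] may be mapped.
 * Only the page at 'last' is locked, and i_size is checked under that lock.
 * Returns the number of pages that are safe to map (0 if none).
 */
static size_t model_map_around(struct model_inode *inode,
                               size_t first, size_t last)
{
    assert(first <= last && last < inode->nr_pages);

    struct model_page *top = &inode->pages[last];
    if (!model_trylock(top))
        return 0;   /* someone else (e.g. truncate) holds it: give up */

    /* Number of pages covered by i_size, rounding up. */
    size_t valid = (inode->i_size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
    size_t count = (last < valid) ? last - first + 1 : 0;

    /* Real code would install the PTEs here, before dropping the lock. */
    model_unlock(top);
    return count;
}
```

With a four-page file, `model_map_around(&inode, 0, 3)` returns 4; once i_size is cut to two pages, the same call returns 0 and only the range 0..1 is accepted.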
________________________________________
From: Rik van Riel [riel@...hat.com]
Sent: February 18, 2014 5:28 AM
To: Linus Torvalds; Kirill A. Shutemov
Cc: Andrew Morton; Mel Gorman; Andi Kleen; Wilcox, Matthew R; Dave Hansen; Alexander Viro; Dave Chinner; linux-mm; linux-fsdevel; Linux Kernel Mailing List
Subject: Re: [RFC, PATCHv2 0/2] mm: map few pages around fault address if they are in page cache

On 02/17/2014 02:01 PM, Linus Torvalds wrote:

>  - increment the page _mapcount (iow, do "page_add_file_rmap()"
> early). This guarantees that any *subsequent* unmap activity on this
> page will walk the file mapping lists, and become serialized by the
> page table lock we hold.
>
>  - mb_after_atomic_inc() (this is generally free)
>
>  - test that the page is still unlocked and uptodate, and the page
> mapping still points to our page.
>
>  - if that is true, we're all good, we can use the page, otherwise we
> decrement the mapcount (page_remove_rmap()) and skip the page.
>
> Hmm? Doing something like this means that we would never lock the
> pages we prefault, and you can go back to your gang lookup rather than
> that "one page at a time". And the race case is basically never going
> to trigger.
>
> Comments?

What would the direct io code do when it runs into a page with
elevated mapcount, but for which a mapping cannot be found yet?

Looking at the code, it looks like the above scheme could cause
some trouble with invalidate_inode_pages2_range(), which has
the following sequence:

                        if (page_mapped(page)) {
                                ... unmap page
                        }
                        BUG_ON(page_mapped(page));

In other words, it looks like incrementing _mapcount first could
lead to an oops in the truncate and direct IO code.

The page lock is used to prevent such races.

*sigh*
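
The interleaving described above can be replayed in a small userspace model. This is only a sketch under stated assumptions: `spec_page` and the `spec_*()` helpers are hypothetical names, the mapcount here starts at 0 rather than the kernel's -1, and the "rmap walk" is reduced to a counter of installed PTEs. It shows that an early mapcount increment with no PTE installed yet leaves nothing for the unmap step to tear down, so the page still looks mapped afterwards.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page. */
struct spec_page {
    atomic_int mapcount;   /* number of mappings; 0 means unmapped here */
    int ptes;              /* PTEs actually installed (what an rmap walk finds) */
    bool locked;
    bool uptodate;
    const void *mapping;
};

/* Step 1 of the proposed scheme: bump the mapcount before any PTE exists. */
static void spec_get(struct spec_page *p)
{
    atomic_fetch_add(&p->mapcount, 1);
    atomic_thread_fence(memory_order_seq_cst);  /* stands in for the barrier */
}

/* Remaining steps: recheck the page; install the PTE or undo the increment. */
static bool spec_finish(struct spec_page *p, const void *mapping)
{
    if (!p->locked && p->uptodate && p->mapping == mapping) {
        p->ptes++;
        return true;
    }
    atomic_fetch_sub(&p->mapcount, 1);
    return false;
}

static bool spec_page_mapped(struct spec_page *p)
{
    return atomic_load(&p->mapcount) > 0;
}

/*
 * Models the quoted sequence: unmap if mapped, then expect the page to be
 * unmapped. Returns true if the BUG_ON() condition would be hit.
 */
static bool spec_invalidate_hits_bug(struct spec_page *p)
{
    if (spec_page_mapped(p)) {
        /* The unmap can only remove PTEs that have been installed. */
        atomic_fetch_sub(&p->mapcount, p->ptes);
        p->ptes = 0;
    }
    return spec_page_mapped(p);
}
```

Calling `spec_get()`, then `spec_finish()`, then `spec_invalidate_hits_bug()` returns false, since the PTE exists and is removed. Calling `spec_invalidate_hits_bug()` between `spec_get()` and `spec_finish()` returns true, which is the window the page lock closes.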

--
All rights reversed