Message-ID: <20210609095953.s6bgfjnwkwvjhfo3@box.shutemov.name>
Date:   Wed, 9 Jun 2021 12:59:53 +0300
From:   "Kirill A. Shutemov" <kirill@...temov.name>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Xu Yu <xuyu@...ux.alibaba.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, hughd@...gle.com,
        akpm@...ux-foundation.org, gavin.dg@...ux.alibaba.com
Subject: Re: [PATCH v2] mm, thp: use head page in __migration_entry_wait

On Tue, Jun 08, 2021 at 02:35:23PM +0100, Matthew Wilcox wrote:
> On Tue, Jun 08, 2021 at 03:58:38PM +0300, Kirill A. Shutemov wrote:
> > On Tue, Jun 08, 2021 at 01:32:21PM +0100, Matthew Wilcox wrote:
> > > On Tue, Jun 08, 2021 at 03:00:26PM +0300, Kirill A. Shutemov wrote:
> > > > But there's one quirk: if split succeed we effectively wait on wrong
> > > > page to be unlocked. And it may take indefinite time if split_huge_page()
> > > > was called on the head page.
> > > 
> > > Hardly indefinite time ... callers of split_huge_page_to_list() usually
> > > unlock the page soon after.  Actually, I can't find one that doesn't call
> > > unlock_page() within a few lines of calling split_huge_page_to_list().
> > 
> > I didn't check all callers, but it's not guaranteed by the interface and
> > it's not hard to imagine a future situation when a page got split on the
> > way to IO and kept locked until IO is complete.
> 
> I would say that can't happen.  Pages are locked when added to the page
> cache and are !Uptodate.  You can't put a PTE in a process page table
> until it's Uptodate, and once it's Uptodate, the page is unlocked.  So
> any subsequent locks are transient, and not for the purposes of IO
> (writebacks only take the page lock transiently).

Documentation/filesystems/locking.rst:

	Note, if the filesystem needs the page to be locked during writeout, that
	is ok, too, the page is allowed to be unlocked at any point in time
	between the calls to set_page_writeback() and end_page_writeback().

I may be misinterpreting what is written here; I know very little about
the writeback path.

> > The wake up shouldn't have much overhead as in most cases split going to
> > be called on the head page.
> 
> I'm not convinced about that.  We go out of our way to not wake up pages
> (eg PageWaiters), and we've had some impressively long lists in the past
> (which is why we now have the bookmarks).

Maybe we should be smarter about when to wake up, I dunno.

I just noticed that with the change we have the /potential/ to wait a long
time for the wrong page to be unlocked. The split_huge_page() interface
doesn't enforce that the page gets unlocked soon after the split is complete.
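
To make the concern concrete, here is a toy user-space model. It is not
kernel code: struct toy_page and every toy_*() helper are hypothetical
names made up for illustration. It also assumes that, after a successful
split, only the page the split was requested on stays locked until its
caller unlocks it. A waiter that resolved the head page before the split
keeps sleeping on that head, even though the subpage its migration entry
referred to is by then an independent, unlocked page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: only the state this sketch needs. */
struct toy_page {
	bool locked;
	struct toy_page *head;	/* compound head; a base page points to itself */
};

/* Build an n-page compound page whose head (pages[0]) is locked. */
static void toy_make_locked_compound(struct toy_page *pages, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		pages[i].locked = false;
		pages[i].head = &pages[0];
	}
	pages[0].locked = true;
}

/* The page a waiter sleeps on if it resolves the head before the split. */
static struct toy_page *toy_pick_wait_page(struct toy_page *entry_page)
{
	return entry_page->head;
}

/*
 * Toy split requested on `target`: every subpage becomes its own head.
 * Assumption: only `target` stays locked; its caller unlocks it later.
 */
static void toy_split(struct toy_page *pages, size_t n, struct toy_page *target)
{
	for (size_t i = 0; i < n; i++) {
		pages[i].head = &pages[i];
		pages[i].locked = (&pages[i] == target);
	}
}

/* A waiter sleeping on `waited` stays blocked for as long as it is locked. */
static bool toy_waiter_blocked(const struct toy_page *waited)
{
	return waited->locked;
}
```

With a 4-page compound page and an entry that refers to pages[2], the
waiter picks pages[0]. After toy_split() on pages[0], pages[2] is its own
unlocked page, yet toy_waiter_blocked() on pages[0] stays true until the
caller of the split clears pages[0].locked.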

-- 
 Kirill A. Shutemov
