Message-ID: <1464034383.16365.70.camel@redhat.com>
Date:	Mon, 23 May 2016 16:13:03 -0400
From:	Rik van Riel <riel@...hat.com>
To:	"Kirill A. Shutemov" <kirill@...temov.name>
Cc:	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Michal Hocko <mhocko@...nel.org>,
	Ebru Akagunduz <ebru.akagunduz@...il.com>, linux-mm@...ck.org,
	hughd@...gle.com, akpm@...ux-foundation.org,
	n-horiguchi@...jp.nec.com, aarcange@...hat.com,
	iamjoonsoo.kim@....com, gorcunov@...nvz.org,
	linux-kernel@...r.kernel.org, mgorman@...e.de, rientjes@...gle.com,
	vbabka@...e.cz, aneesh.kumar@...ux.vnet.ibm.com,
	hannes@...xchg.org, boaz@...xistor.com
Subject: Re: [PATCH 3/3] mm, thp: make swapin readahead under down_read of
 mmap_sem

On Mon, 2016-05-23 at 23:02 +0300, Kirill A. Shutemov wrote:
> On Mon, May 23, 2016 at 03:26:47PM -0400, Rik van Riel wrote:
> > 
> > On Mon, 2016-05-23 at 22:01 +0300, Kirill A. Shutemov wrote:
> > > 
> > > On Mon, May 23, 2016 at 02:49:09PM -0400, Rik van Riel wrote:
> > > > 
> > > > 
> > > > On Mon, 2016-05-23 at 20:42 +0200, Michal Hocko wrote:
> > > > > 
> > > > > 
> > > > > On Mon 23-05-16 20:14:11, Ebru Akagunduz wrote:
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > Currently khugepaged makes swapin readahead under
> > > > > > > down_write. This patch makes swapin readahead run
> > > > > > > under down_read instead of down_write.
> > > > > You are still keeping down_write. Can we do without it
> > > > > altogether?
> > > > > Blocking mmap_sem of a remote process for write is certainly
> > > > > not nice.
> > > > Maybe Andrea can explain why khugepaged requires
> > > > a down_write of mmap_sem?
> > > > 
> > > > If it were possible to have just down_read that
> > > > would make the code a lot simpler.
> > > You need a down_write() to retract the page table. We need to
> > > make sure that nobody sees the page table before we can replace
> > > it with a huge pmd.
> > Good point.
> > 
> > I guess the alternative is to have the page_table_lock
> > taken by a helper function (everywhere) that can return
> > failure if the page table was changed while the caller
> > was waiting for the lock.
> It's not that the page table was changed, but that the pmd is now
> pointing to something else.
> Basically, we would need to nest all pte-ptl's within pmd_lock().
> That's not good for scalability.

I can see a few alternatives here:

1) huge pmd collapsing takes both the pmd lock and the pte lock,
   preventing pte updates from happening simultaneously

2) code that (re-)acquires the pte lock can read a sequence number
   at the pmd level, check that it did not change after the
   pte lock has been acquired, and abort if it has - I believe most
   of the code that re-acquires the pte lock already knows how to
   abort if somebody else touched the pte while it was looking
   elsewhere

That way the (uncommon) thp collapse code should still exclude
pte level operations, at the cost of potentially teaching a few
more pte level operations to abort (chances are most already do,
since racing with other pte-level manipulations already requires it).
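
To make alternative 2 concrete, here is a rough userspace sketch, not
kernel code. Everything in it (struct pmd_model, pmd_seq_begin(),
pte_store(), collapse_huge_pmd()) is a hypothetical name made up for
illustration, and pthread mutexes stand in for the pmd and pte locks.
The collapse side takes both locks and bumps a sequence number; the pte
side samples the number first, takes the pte lock, and aborts if the
number changed in between.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of one pmd entry and the pte table below it. */
struct pmd_model {
	pthread_mutex_t pmd_lock;	/* stands in for the pmd lock */
	pthread_mutex_t pte_lock;	/* stands in for the pte lock */
	atomic_uint seq;		/* bumped by every collapse */
	bool huge;			/* pte table replaced by a huge pmd */
	int pte_value;			/* stands in for one pte entry */
};

static void pmd_model_init(struct pmd_model *p)
{
	pthread_mutex_init(&p->pmd_lock, NULL);
	pthread_mutex_init(&p->pte_lock, NULL);
	atomic_init(&p->seq, 0);
	p->huge = false;
	p->pte_value = 0;
}

/* Sample the pmd-level sequence number before dropping/taking locks. */
static unsigned int pmd_seq_begin(struct pmd_model *p)
{
	return atomic_load_explicit(&p->seq, memory_order_acquire);
}

/*
 * pte-level update: take the pte lock, then check that no collapse
 * happened since 'seq' was sampled. Returns false (abort) if it did.
 */
static bool pte_store(struct pmd_model *p, unsigned int seq, int value)
{
	bool ok;

	pthread_mutex_lock(&p->pte_lock);
	ok = !p->huge &&
	     atomic_load_explicit(&p->seq, memory_order_acquire) == seq;
	if (ok)
		p->pte_value = value;
	pthread_mutex_unlock(&p->pte_lock);
	return ok;
}

/* Collapse: hold both locks so no pte update can run concurrently. */
static void collapse_huge_pmd(struct pmd_model *p)
{
	pthread_mutex_lock(&p->pmd_lock);
	pthread_mutex_lock(&p->pte_lock);
	/* In this model a page table is collapsed at most once. */
	assert(!p->huge);
	atomic_fetch_add_explicit(&p->seq, 1, memory_order_release);
	p->huge = true;
	pthread_mutex_unlock(&p->pte_lock);
	pthread_mutex_unlock(&p->pmd_lock);
}
```

A caller would do seq = pmd_seq_begin(), do its unlocked work (for
example the swapin), and then call pte_store(); a false return means
"somebody collapsed this range, start over", which is the same kind of
abort path such code needs for ordinary pte races.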

-- 
All Rights Reversed.

