Message-ID: <20131216171214.GA15663@sgi.com>
Date:	Mon, 16 Dec 2013 11:12:15 -0600
From:	Alex Thorlton <athorlton@....com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-mm@...ck.org,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Rik van Riel <riel@...hat.com>,
	Wanpeng Li <liwanp@...ux.vnet.ibm.com>,
	Mel Gorman <mgorman@...e.de>,
	Michel Lespinasse <walken@...gle.com>,
	Benjamin LaHaise <bcrl@...ck.org>,
	Oleg Nesterov <oleg@...hat.com>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Andy Lutomirski <luto@...capital.net>,
	Al Viro <viro@...iv.linux.org.uk>,
	David Rientjes <rientjes@...gle.com>,
	Zhang Yanfei <zhangyanfei@...fujitsu.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Michal Hocko <mhocko@...e.cz>,
	Jiang Liu <jiang.liu@...wei.com>,
	Cody P Schafer <cody@...ux.vnet.ibm.com>,
	Glauber Costa <glommer@...allels.com>,
	Kamezawa Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	linux-kernel@...r.kernel.org,
	Andrea Arcangeli <aarcange@...hat.com>
Subject: Re: [RFC PATCH 0/3] Change how we determine when to hand out THPs

> Please cc Andrea on this.

I'm going to clean up a few small things for a v2 pretty soon; I'll be
sure to cc Andrea there.

> > My proposed solution to the problem is to allow users to set a
> > threshold at which THPs will be handed out.  The idea here is that, when
> > a user faults in a page in an area where they would usually be handed a
> > THP, we pull 512 pages off the free list, as we would with a regular
> > THP, but we only fault in single pages from that chunk, until the user
> > has faulted in enough pages to pass the threshold we've set.  Once they
> > pass the threshold, we do the necessary work to turn our 512 page chunk
> > into a proper THP.  As it stands now, if the user tries to fault in
> > pages from different nodes, we completely give up on ever turning a
> > particular chunk into a THP, and just fault in the 4K pages as they're
> > requested.  We may want to make this tunable in the future (i.e. allow
> > them to fault in from only 2 different nodes).
> 
> OK.  But all 512 pages reside on the same node, yes?  Whereas with thp
> disabled those 512 pages would have resided closer to the CPUs which
> instantiated them.  

As it stands right now, yes: since we're pulling a 512-page contiguous
chunk off the free list, everything from that chunk will reside on the
same node.  As I (stupidly) forgot to mention in my original e-mail,
though, one piece I have yet to add is the functionality to put the
remaining unfaulted pages from our chunk *back* on the free list once we
give up on handing out a THP.  Once that's in, things will behave more
like they do when THP is turned completely off, i.e. after we give up on
the THP, pages will get faulted in closer to the CPU that first
referenced them.
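
To make that give-up path concrete, it would look something like the
sketch below (just a sketch, not the actual patch: the thp_reserve
structure and helper names are made up, and the bitmap is stand-in
bookkeeping for which pages have already been faulted in):

#include <linux/mm.h>
#include <linux/gfp.h>
#include <linux/huge_mm.h>
#include <linux/bitmap.h>

struct thp_reserve {				/* hypothetical bookkeeping */
	struct page *base;			/* first page of the 512-page chunk */
	DECLARE_BITMAP(faulted, HPAGE_PMD_NR);	/* pages already handed out */
};

/* Reserve a contiguous HPAGE_PMD_NR chunk, but hand out 4K pages. */
static struct page *thp_reserve_chunk(struct thp_reserve *res)
{
	struct page *page;

	/*
	 * Non-compound order-9 allocation, so split_page() can break
	 * it into individual order-0 pages.
	 */
	page = alloc_pages(GFP_HIGHUSER_MOVABLE, HPAGE_PMD_ORDER);
	if (!page)
		return NULL;
	split_page(page, HPAGE_PMD_ORDER);
	res->base = page;
	bitmap_zero(res->faulted, HPAGE_PMD_NR);
	return page;
}

/* Give up on the THP: return every still-unfaulted page to the buddy. */
static void thp_reserve_abandon(struct thp_reserve *res)
{
	int i;

	for (i = 0; i < HPAGE_PMD_NR; i++)
		if (!test_bit(i, res->faulted))
			__free_page(res->base + i);
	res->base = NULL;
}

The key point is that split_page() lets us hand out order-0 pages from
the chunk, so the unfaulted remainder can go straight back to the free
list and later faults can allocate locally again.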

> So the expected result will be somewhere in between
> the 93 secs and the 76 secs?

Yes.  Given the time it takes to search the list of temporary THPs, I'm
sure we won't get all the way down to 76 secs, but hopefully we'll get
close.  I'm also considering switching the linked list that stores the
temporary THPs over to an rbtree to make that search faster, just FYI.
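
For reference, that would turn the list walk into a standard kernel
rbtree search keyed on the PMD-aligned address, roughly like this
(again just a sketch; thp_chunk is a made-up structure):

#include <linux/rbtree.h>
#include <linux/huge_mm.h>

struct thp_chunk {			/* hypothetical per-chunk node */
	struct rb_node node;
	unsigned long start;		/* HPAGE_PMD_SIZE-aligned start */
};

/* Find the temporary chunk covering @addr in O(log n). */
static struct thp_chunk *thp_chunk_find(struct rb_root *root,
					unsigned long addr)
{
	struct rb_node *n = root->rb_node;

	addr &= HPAGE_PMD_MASK;		/* one chunk per PMD range */
	while (n) {
		struct thp_chunk *c = rb_entry(n, struct thp_chunk, node);

		if (addr < c->start)
			n = n->rb_left;
		else if (addr > c->start)
			n = n->rb_right;
		else
			return c;
	}
	return NULL;
}

Insertion would be the usual rb_link_node()/rb_insert_color() pattern,
keyed the same way.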

> That being said, I don't see a downside to the idea, apart from some
> additional setup cost in kernel code.

Good to hear.  I still need to address some of the issues that others
have raised, and finish up the few pieces that aren't fully working yet.
I'll get things polished up and send out some more informative test
results soon.

Thanks for looking at the patch!

- Alex