Message-ID: <20131225190747.GB195633@sgi.com>
Date: Wed, 25 Dec 2013 13:07:47 -0600
From: Alex Thorlton <athorlton@....com>
To: Andrea Arcangeli <aarcange@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Rik van Riel <riel@...hat.com>,
Wanpeng Li <liwanp@...ux.vnet.ibm.com>,
Mel Gorman <mgorman@...e.de>,
Michel Lespinasse <walken@...gle.com>,
Benjamin LaHaise <bcrl@...ck.org>,
Oleg Nesterov <oleg@...hat.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Andy Lutomirski <luto@...capital.net>,
Al Viro <viro@...iv.linux.org.uk>,
David Rientjes <rientjes@...gle.com>,
Zhang Yanfei <zhangyanfei@...fujitsu.com>,
Peter Zijlstra <peterz@...radead.org>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...e.cz>,
Jiang Liu <jiang.liu@...wei.com>,
Cody P Schafer <cody@...ux.vnet.ibm.com>,
Glauber Costa <glommer@...allels.com>,
Kamezawa Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/3] Change how we determine when to hand out THPs

On Tue, Dec 17, 2013 at 06:55:00PM +0100, Andrea Arcangeli wrote:
> On Tue, Dec 17, 2013 at 10:20:07AM -0600, Alex Thorlton wrote:
> > This message in particular:
> >
> > https://lkml.org/lkml/2013/8/2/697
>
> I think adding a prctl (or similar) inherited by child to turn off THP
> would be a fine addition to the current madvise. So you can then run
> any static app under a wrapper like "THP_disable ./whatever"
>
> The idea is, if the software is maintained, madvise allows for
> fine-grained optimization; if the software is legacy, proprietary,
> statically linked (or if it already uses LD_PRELOAD for other things),
> prctl takes care of that in a more coarse way (but still per-app).

That sounds fine. I'll dig up the old patches that I wrote a while back
to enable this, and get them cleaned up and rebased to the latest kernel
version for people to review.

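
For illustration only, a minimal sketch of what the "THP_disable
./whatever" wrapper could look like. It assumes the switch ends up as a
prctl option pair named PR_SET_THP_DISABLE / PR_GET_THP_DISABLE with the
values 41 and 42; those names and numbers are an assumption relative to
this thread (they match what later mainline headers carry), and the
helper names are hypothetical.

```c
#include <stdio.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Assumed option numbers; only defined here if the headers lack them. */
#ifndef PR_SET_THP_DISABLE
#define PR_SET_THP_DISABLE 41
#endif
#ifndef PR_GET_THP_DISABLE
#define PR_GET_THP_DISABLE 42
#endif

/* Set the per-process "THP disable" flag.
 * Returns 0 on success, -1 (with errno set) on failure. */
int thp_disable_self(void)
{
	return prctl(PR_SET_THP_DISABLE, 1UL, 0UL, 0UL, 0UL);
}

/* Query the flag: 1 if set, 0 if clear, -1 (with errno set) on failure. */
int thp_is_disabled(void)
{
	return prctl(PR_GET_THP_DISABLE, 0UL, 0UL, 0UL, 0UL);
}

/* Hypothetical wrapper body: disable THP for this process, then exec the
 * target so that it (and its children) start with the flag already set.
 * Only returns on failure. */
int thp_disable_exec(char *const argv[])
{
	if (thp_disable_self() != 0) {
		perror("prctl(PR_SET_THP_DISABLE)");
		return -1;
	}
	execvp(argv[0], argv);
	perror("execvp");
	return -1;
}
```

A wrapper binary would call thp_disable_exec(argv + 1) from main() after
checking argc. This relies on the flag surviving fork() and execve(),
which is the "inherited by child" behavior described above.
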
> > The thread I mention above originally proposed a per-process switch to
> > disable THP without the use of madvise, but it was not very well
> > received. I'm more than willing to revisit that idea, and possibly
>
> I think you provided enough explanation of why it is needed (static
> binaries, proprietary apps, annoyance of LD_PRELOAD that may collide
> with other LD_PRELOAD in proprietary apps, whatever), so I think a
> prctl is a reasonable addition to the madvise.
>
> We also have an madvise to turn on THP selectively on embedded systems
> that may boot with enabled=madvise to be sure not to waste any memory
> because of THP. But a prctl to selectively enable doesn't make too
> much sense, as one has to selectively enable it in a fine-grained way
> to be sure not to cause any memory waste. So I think a NOHUGEPAGE
> prctl would be enough.
>
> > meld the two (a per-process threshold, instead of a big-hammer on-off
> > switch). Let me know if that seems preferable to this idea and we can
> > discuss.
>
> The per-process threshold would be a much bigger patch. I think
> starting with the big-hammer on-off is preferable, as it is much
> simpler and it should be more than enough to take care of the rare
> corner cases, while leaving the other workloads unaffected (modulo the
> cacheline to check the task or mm flags) running at max speed.

Agreed. While I still would like to explore the threshold idea further,
I'm all for putting in a simpler fix to our current problem that will
leave default behavior unaffected.

> To evaluate the threshold solution, a variety of benchmarks of a
> multitude of apps would be necessary first, to see the effect it has
> on the non-corner cases. Adding the big-hammer on-off prctl instead is
> a black and white design solution that won't require black magic
> settings.
>
> Ideally if we add a threshold later it won't require any more
> cacheline accesses, as the threshold would also need to be per-task or
> per-mm so the runtime cost of the prctl would be zero then and it
> could then become a benchmarking tweak even if we add the per-app
> threshold later.
>
> About creating heuristics to automatically detect the ideal value of
> the big-hammer per-app on/off switch (or even harder the ideal value
> of the per-app threshold), I think it's not going to happen because
> there are too few corner cases and it wouldn't be worth the cost of it
> (the cost would be significant no matter how implemented).

I see where you're coming from here. If we do decide to move further
with implementing a threshold solution in the future, I think the best
idea is to have it default to 1, which would maintain current behavior
and leave the non-corner cases unaffected.
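
As a rough sketch of why a default of 1 keeps today's behavior, the
decision could be modeled as below. This assumes the threshold counts
base pages faulted within one huge-page-sized range, including the
current fault; the struct, its field names, and the function are
hypothetical and not taken from any posted patch.

```c
#include <stdbool.h>

/* Hypothetical per-mm (or per-task) policy. */
struct thp_policy {
	bool disabled;          /* big-hammer on/off switch */
	unsigned int threshold; /* faulted base pages needed before a THP is handed out */
};

/* faulted_pages: base pages touched so far in this huge-page-sized range,
 * counting the fault being handled now. */
bool thp_should_allocate(const struct thp_policy *p, unsigned int faulted_pages)
{
	if (p->disabled)
		return false;
	return faulted_pages >= p->threshold;
}
```

With threshold == 1 the very first fault in a range qualifies, which is
what happens now; larger values delay the THP until more of the range
has actually been touched.
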

Thanks for your suggestions!

- Alex