Date:	Fri, 26 Feb 2016 11:13:16 +0000
From:	Mel Gorman <mgorman@...hsingularity.net>
To:	Andrea Arcangeli <aarcange@...hat.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Vlastimil Babka <vbabka@...e.cz>,
	Rik van Riel <riel@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Linux-MM <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/1] mm: thp: Redefine default THP defrag behaviour
 disable it by default

On Fri, Feb 26, 2016 at 12:02:19AM +0100, Andrea Arcangeli wrote:
> On Thu, Feb 25, 2016 at 07:56:13PM +0000, Mel Gorman wrote:
> > Which is a specialised case that does not apply to all users. Remember
> > that the data showed that a basic streaming write of an anon mapping on
> > a freshly booted NUMA system was enough to stall the process for long
> > periods of time.
> > 
> > Even in the specialised case, a single VM reaching its peak performance
> > may rely on getting THP but if that's at the cost of reclaiming other
> > pages that may be hot to a second VM then it's an overall loss.
> 
> You're mixing the concern of that THP will use more memory with the
> cost of defragmentation.

There are three cases:

1. THP was allocated when the application only required 4K and consumes
   more memory. This has always been the case but is not the concern here.
2. Memory is fragmented but there are enough free pages. In this case,
   only compaction is required and the memory footprint is the same.
3. Memory is fragmented and pages have to be freed before compaction.

It's 3 I was referring to even though all the cases are important.

> If you've memory issues and you are ok to
> sacrifice performance for swapping less you should disable THP, set it
> to never, and that's it.
> 

I want to get to the half-way point where THP is used if easily available
without worrying that there will be stalls at some point in the future
or requiring application modification for madvise. That's better than the
all or nothing approach that users are currently faced with. I wince every
time I see a tuning guide suggesting THP be disabled and have handled too
many bugs where disabling THP was a workaround.
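
To illustrate the kind of application modification for madvise mentioned
above, here is a minimal Python sketch, not taken from the patch. The
helper name map_with_thp_hint and the 2 MiB huge page size are assumptions
made for the example. mmap.madvise() and mmap.MADV_HUGEPAGE are only
available on Linux with Python 3.8 or later, so the sketch falls back
quietly when the hint is missing or rejected.

```python
import mmap

# Assumption: 2 MiB huge pages, which is typical on x86-64 but not universal.
HUGE_PAGE_SIZE = 2 * 1024 * 1024


def map_with_thp_hint(length=4 * HUGE_PAGE_SIZE):
    """Create an anonymous private mapping and hint that THP is wanted.

    Returns (mapping, hinted). hinted is False when MADV_HUGEPAGE is not
    exposed by this Python build or the kernel rejects the request.
    """
    buf = mmap.mmap(
        -1,
        length,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    hinted = False
    if hasattr(mmap, "MADV_HUGEPAGE") and hasattr(buf, "madvise"):
        try:
            # The hint covers the whole mapping; the kernel is free to ignore it.
            buf.madvise(mmap.MADV_HUGEPAGE)
            hinted = True
        except OSError:
            # For example, a kernel built without THP support.
            hinted = False
    return buf, hinted
```

A caller would use the returned mapping as ordinary memory. Whether huge
pages are actually supplied still depends on the system-wide THP settings.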

That said, you made a number of important points. I'm not going to respond
to them individually because I believe I understand your concerns and now
agree with them.  I've prototyped a patch that modifies the defrag tunable
as follows:

1. "madvise", the default, will direct reclaim/compact only for
   applications that specifically requested that behaviour. This will
   avoid breaking MADV_HUGEPAGE, which you mentioned in a few places.
2. "never" will never reclaim anything. It was the default behaviour of
   version 1 but will not be the default in version 2.
3. "defer" will wake kswapd, which will reclaim or wake kcompactd,
   whichever is necessary. This is new but avoids stalls while helping
   khugepaged do its work quickly in the near future.
4. "always" will direct reclaim/compact just like today's behaviour.

I'm testing it at the moment to make sure each of the options behaves
correctly.
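
As a rough illustration of the four modes listed above, the following
Python sketch reads a THP defrag setting and maps it to the behaviour
described in this message. It assumes the tunable lives at
/sys/kernel/mm/transparent_hugepage/defrag and that the active value is
shown in square brackets, as the existing THP sysfs files do. The "defer"
value comes from the prototype described here and may not exist on a given
kernel. The names DEFRAG_PATH, PROPOSED_MODES, active_mode and describe
are hypothetical.

```python
import re
from pathlib import Path

# Assumption: standard sysfs location of the THP defrag tunable.
DEFRAG_PATH = Path("/sys/kernel/mm/transparent_hugepage/defrag")

# Behaviour of each mode as summarised in the message; wording is paraphrased.
PROPOSED_MODES = {
    "madvise": "direct reclaim/compaction only for MADV_HUGEPAGE regions",
    "never": "never reclaim or compact for THP",
    "defer": "wake kswapd/kcompactd in the background, no direct stall",
    "always": "direct reclaim/compaction on every THP fault",
}


def active_mode(sysfs_text):
    """Return the bracketed (active) value from text such as
    'always defer [madvise] never'."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    if match is None:
        raise ValueError("no active mode found in %r" % sysfs_text)
    return match.group(1)


def describe(sysfs_text):
    """Return (mode, description) for the active defrag mode."""
    mode = active_mode(sysfs_text)
    return mode, PROPOSED_MODES.get(mode, "unknown mode")


if __name__ == "__main__":
    # Reading the real file only works on a Linux system with THP enabled.
    if DEFRAG_PATH.exists():
        print(describe(DEFRAG_PATH.read_text()))
```

Switching modes is done by writing one of the accepted values to the same
file as root; the sketch deliberately only reads the setting.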

-- 
Mel Gorman
SUSE Labs
