linux-kernel - Re: [patch 1/4] mm: memcontrol: reduce reclaim invocations for higher order requests


Message-ID: <20140807153141.GD14734@cmpxchg.org>
Date:	Thu, 7 Aug 2014 11:31:41 -0400
From:	Johannes Weiner <hannes@...xchg.org>
To:	Michal Hocko <mhocko@...e.cz>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Tejun Heo <tj@...nel.org>, linux-mm@...ck.org,
	cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [patch 1/4] mm: memcontrol: reduce reclaim invocations for
 higher order requests

On Thu, Aug 07, 2014 at 03:08:22PM +0200, Michal Hocko wrote:
> On Mon 04-08-14 17:14:54, Johannes Weiner wrote:
> > Instead of passing the request size to direct reclaim, memcg just
> > manually loops around reclaiming SWAP_CLUSTER_MAX pages until the
> > charge can succeed.  That potentially wastes scan progress when huge
> > page allocations require multiple invocations, which always have to
> > restart from the default scan priority.
> > 
> > Pass the request size as a reclaim target to direct reclaim and leave
> > it to that code to reach the goal.
> 
> THP charge then will ask for 512 pages to be (direct) reclaimed. That
> is _a lot_ and I would expect long stalls to achieve this target. I
> would also expect quick priority drop down and potential over-reclaim
> for small and moderately sized memcgs (e.g. memcg with 1G worth of pages
> would need to drop down below DEF_PRIORITY-2 to have a chance to scan
> that many pages). All of that is done for a charge which can fall back
> to a single page charge.
> 
> The current code is quite hostile to THP when we are close to the limit
> but solving this by introducing long stalls instead doesn't sound like a
> proper approach to me.
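
[Editor's note] The priority arithmetic in the quoted paragraph can be checked with a small model. This is only a sketch, not kernel code: it assumes 4 KiB base pages, 512 base pages per THP, DEF_PRIORITY = 12, a per-pass scan target of roughly `lru_size >> priority`, and (as a simplification) that all of the memcg's 1G sits on a single LRU list. The helper names `scan_target` and `first_priority_reaching` are hypothetical.

```python
# Toy model of how many pages one reclaim pass scans at a given priority.
# Assumptions (not from the thread): 4 KiB pages, one LRU list holding the
# whole memcg, and a scan target of lru_pages >> priority per pass.

PAGE_SIZE = 4096        # bytes per base page (assumed)
DEF_PRIORITY = 12       # default scan priority reclaim starts from
SWAP_CLUSTER_MAX = 32   # pages requested per invocation by the current code
THP_PAGES = 512         # base pages in one 2 MiB huge page (assumed)


def scan_target(lru_pages, priority):
    """Pages scanned in one pass over an LRU list at this priority."""
    return lru_pages >> priority


def first_priority_reaching(lru_pages, nr_pages):
    """Highest priority value whose single pass scans at least nr_pages."""
    for priority in range(DEF_PRIORITY, -1, -1):
        if scan_target(lru_pages, priority) >= nr_pages:
            return priority
    return None  # the list is smaller than nr_pages


memcg_pages = (1 << 30) // PAGE_SIZE  # 1G worth of pages

for priority in range(DEF_PRIORITY, DEF_PRIORITY - 4, -1):
    print(priority, scan_target(memcg_pages, priority))

print("SWAP_CLUSTER_MAX reachable at priority",
      first_priority_reaching(memcg_pages, SWAP_CLUSTER_MAX))
print("THP target reachable at priority",
      first_priority_reaching(memcg_pages, THP_PAGES))
```

Under these assumptions a single pass at DEF_PRIORITY scans 64 pages, enough to cover a SWAP_CLUSTER_MAX request, while a pass does not scan 512 pages until priority 9, that is DEF_PRIORITY-3. The model says nothing about how many scanned pages are actually reclaimed or about the resulting latency.
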

THP latencies are actually the same when comparing high limit nr_pages
reclaim with the current hard limit SWAP_CLUSTER_MAX reclaim, although
system time is reduced with the high limit.

High limit reclaim with SWAP_CLUSTER_MAX has better fault latency but
it doesn't actually contain the workload - with 1G high and a 4G load,
the consumption at the end of the run is 3.7G.

So what I'm proposing works and is of equal quality from a THP POV.
This change is complicated enough when we stick to the facts; let's
not make things up based on gut feeling.