linux-kernel - Re: [patch] mm: vmscan: invoke slab shrinkers from shrink

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150416231753.GC15810@dastard>
Date:	Fri, 17 Apr 2015 09:17:53 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Johannes Weiner <hannes@...xchg.org>
Cc:	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Vladimir Davydov <vdavydov@...allels.com>,
	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [patch] mm: vmscan: invoke slab shrinkers from shrink_zone()

On Thu, Apr 16, 2015 at 10:34:13AM -0400, Johannes Weiner wrote:
> On Thu, Apr 16, 2015 at 12:57:36PM +0900, Joonsoo Kim wrote:
> > This causes following success rate regression of phase 1,2 on stress-highalloc
> > benchmark. The situation of phase 1,2 is that many high order allocations are
> > requested while many threads do kernel build in parallel.
> 
> Yes, the patch made the shrinkers on multi-zone nodes less aggressive.
> From the changelog:
> 
>     This changes kswapd behavior, which used to invoke the shrinkers for each
>     zone, but with scan ratios gathered from the entire node, resulting in
>     meaningless pressure quantities on multi-zone nodes.
> 
> So the previous code *did* apply more pressure on the shrinkers, but
> it didn't make any sense.  The number of slab objects to scan for each
> scanned LRU page depended on how many zones there were in a node, and
> their relative sizes.  So a node with a large DMA32 and a small Normal
> would receive vastly different relative slab pressure than a node with
> only one big zone Normal.  That's not something we should revert to.
> 
> If we are too weak on objects compared to LRU pages then we should
> adjust DEFAULT_SEEKS or individual shrinker settings.

Now this thread has my attention. Changing shrinker defaults will
seriously upset the memory balance under load (in unpredictable
ways) so I really don't think we should even consider changing
DEFAULT_SEEKS.

If there's a shrinker imbalance, we need to understand which
shrinker needs rebalancing, then modify that shrinker's
configuration and then observe the impact this has on the rest of
the system. This means looking at variance of the memory footprint
in steady state, reclaim overshoot and damping rates before steady
state is acheived, etc.  Balancing multiple shrinkers (especially
those with dependencies on other caches) under memory
load is a non-trivial undertaking.

I don't see any evidence that we have a shrinker imbalance, so I
really suspect the problem is "shrinkers aren't doing enough work".
In that case, we need to increase the pressure being generated, not
start fiddling around with shrinker configurations.

> If we think our pressure ratio is accurate but we don't reclaim enough
> compared to our compaction efforts, then any adjustments to improve
> huge page successrate should come from the allocator/compaction side.

Right - if compaction is failing, then the problem is more likely
that it isn't generating enough pressure, and so the shrinkers
aren't doing the work we are expecting them to do. That's a problem
with compaction, not the shrinkers...

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/