Message-ID: <20131203093323.GC8803@dastard>
Date:	Tue, 3 Dec 2013 20:33:23 +1100
From:	Dave Chinner <david@...morbit.com>
To:	Vladimir Davydov <vdavydov@...allels.com>
Cc:	hannes@...xchg.org, mhocko@...e.cz, dchinner@...hat.com,
	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, cgroups@...r.kernel.org, devel@...nvz.org,
	glommer@...nvz.org, Mel Gorman <mgorman@...e.de>,
	Rik van Riel <riel@...hat.com>
Subject: Re: [PATCH v12 06/18] vmscan: rename shrink_slab() args to make it
 more generic

On Mon, Dec 02, 2013 at 03:19:41PM +0400, Vladimir Davydov wrote:
> Currently in addition to a shrink_control struct shrink_slab() takes two
> arguments, nr_pages_scanned and lru_pages, which are used for balancing
> slab reclaim versus page reclaim - roughly speaking, shrink_slab() will
> try to scan nr_pages_scanned/lru_pages fraction of all slab objects.

Yes, that is its primary purpose, and the variables explain that
clearly. i.e. it passes a quantity of work to be done, and a value
to relate that to the overall size of the cache that the work was
done on. i.e. they tell us that shrink_slab is trying to stay in
balance with the amount of work that page cache reclaim has just
done.

> However, shrink_slab() is not always called after page cache reclaim.
> For example, drop_slab() uses shrink_slab() to drop as many slab objects
> as possible and thus has to pass phony values 1000/1000 to it, which do
> not make sense for nr_pages_scanned/lru_pages. Moreover, as soon as

Yup, but that's not the primary purpose of the code, and doesn't
require balancing against page cache reclaim. Hence the numbers that
are passed in are just there to make the shrinkers iterate
efficiently but without doing too much work in a single scan. i.e.
reclaim in chunks across all caches, rather than try to completely
remove a single cache at a time....

And the reason that this is done? Because caches have reclaim
relationships that mean some shrinkers can't make progress until
other shrinkers do their work. Hence to effectively free all memory,
we have to iterate repeatedly across all caches. That's what
drop_slab() does.
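
As a rough illustration of why the iteration matters, here is a toy
userspace model in C. It is not the kernel code: struct toy_cache,
toy_shrink() and toy_drop_all() are hypothetical names, and the rule
that each object in one cache pins exactly one object in another is
an assumption made purely for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy cache. If pinned_by is set, every object still
 * held in that other cache pins one of our objects. */
struct toy_cache {
	unsigned long objects;
	const struct toy_cache *pinned_by;	/* NULL if independent */
};

/* Free at most 'batch' objects, skipping the ones that are pinned. */
static unsigned long toy_shrink(struct toy_cache *c, unsigned long batch)
{
	unsigned long pinned = c->pinned_by ? c->pinned_by->objects : 0;
	unsigned long freeable = c->objects > pinned ? c->objects - pinned : 0;
	unsigned long n = freeable < batch ? freeable : batch;

	c->objects -= n;
	return n;
}

/* One pass takes a bounded chunk from every cache. Repeat until a
 * full pass frees nothing. Returns the number of passes made. */
static unsigned long toy_drop_all(struct toy_cache **caches, size_t ncaches,
				  unsigned long batch)
{
	unsigned long passes = 0, freed;

	assert(batch > 0);
	do {
		freed = 0;
		for (size_t i = 0; i < ncaches; i++)
			freed += toy_shrink(caches[i], batch);
		passes++;
	} while (freed > 0);

	return passes;
}
```

In this model, with two caches of 300 objects each where one pins
the other and a batch of 100, the pinned cache frees nothing on the
first pass, and both caches only empty after several passes over the
whole set.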
> kmemcg reclaim is introduced, we will have to make up phony values for
> nr_pages_scanned and lru_pages again when doing kmem-only reclaim for a
> memory cgroup, which is possible if the cgroup has its kmem limit less
> than the total memory limit.

I'm missing something here - why would memcg reclaim require
passing phony values? How are you going to keep slab caches in
balance with memory pressure generated by the page cache?

And, FWIW:

>  unsigned long shrink_slab(struct shrink_control *shrink,
> -			  unsigned long nr_pages_scanned,
> -			  unsigned long lru_pages);
> +			  unsigned long fraction, unsigned long denominator);

I'm not sure what "fraction" means in this case. A fraction is made
up of a numerator and denominator, but:

> @@ -243,9 +243,9 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
>  	nr = atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
>  
>  	total_scan = nr;
> -	delta = (4 * nr_pages_scanned) / shrinker->seeks;
> +	delta = (4 * fraction) / shrinker->seeks;

(4 * nr_pages_scanned) is a dividend, while:

>  	delta *= max_pass;
> -	do_div(delta, lru_pages + 1);
> +	do_div(delta, denominator + 1);

(lru_pages + 1) is a divisor.
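
For reference, the arithmetic in the two hunks above can be written
out as a small userspace sketch. scan_delta() is a hypothetical
helper, not a kernel function; it assumes 64-bit arithmetic and uses
plain division where the kernel uses do_div().

```c
#include <assert.h>
#include <stdint.h>

/* Userspace sketch of the delta calculation in the quoted hunks.
 * Parameter names follow the pre-patch code. */
static uint64_t scan_delta(uint64_t nr_pages_scanned, uint64_t lru_pages,
			   uint64_t max_pass, unsigned int seeks)
{
	uint64_t delta;

	assert(seeks > 0);

	/* Dividend: the work page reclaim just did, weighted by seeks. */
	delta = (4 * nr_pages_scanned) / seeks;
	delta *= max_pass;

	/* Divisor: the LRU size; the +1 avoids dividing by zero. */
	delta /= lru_pages + 1;

	return delta;
}
```

With the 1000/1000 values that drop_slab() passes and seeks = 2,
this comes out a little under 2 * max_pass, so the shrinker is asked
to scan more objects than the cache holds.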

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com