linux-kernel - Re: [PATCH] Revert "mm:vmscan: fix inaccurate reclaim during proactive reclaim"

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Za-H8NNW9bL-I4gj@tiehlicka>
Date: Tue, 23 Jan 2024 10:33:36 +0100
From: Michal Hocko <mhocko@...e.com>
To: "T.J. Mercier" <tjmercier@...gle.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
	Roman Gushchin <roman.gushchin@...ux.dev>,
	Shakeel Butt <shakeelb@...gle.com>,
	Muchun Song <muchun.song@...ux.dev>,
	Andrew Morton <akpm@...ux-foundation.org>, android-mm@...gle.com,
	yuzhao@...gle.com, yangyifei03@...ishou.com,
	cgroups@...r.kernel.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Revert "mm:vmscan: fix inaccurate reclaim during
 proactive reclaim"

On Sun 21-01-24 21:44:12, T.J. Mercier wrote:
> This reverts commit 0388536ac29104a478c79b3869541524caec28eb.
> 
> Proactive reclaim on the root cgroup is 10x slower after this patch when
> MGLRU is enabled, and completion times for proactive reclaim on much
> smaller non-root cgroups take ~30% longer (with or without MGLRU).

What is the reclaim target in these pro-active reclaim requests?

> With
> root reclaim before the patch, I observe average reclaim rates of
> ~70k pages/sec before try_to_free_mem_cgroup_pages starts to fail and
> the nr_retries counter starts to decrement, eventually ending the
> proactive reclaim attempt.

Do I understand correctly that the reclaim target is over estimated and
you expect that the reclaim process breaks out early>

> After the patch the reclaim rate is
> consistently ~6.6k pages/sec due to the reduced nr_pages value causing
> scan aborts as soon as SWAP_CLUSTER_MAX pages are reclaimed. The
> proactive reclaim doesn't complete after several minutes because
> try_to_free_mem_cgroup_pages is still capable of reclaiming pages in
> tiny SWAP_CLUSTER_MAX page chunks and nr_retries is never decremented.

I do not understand this part. How does a smaller reclaim target manages
to have reclaimed > 0 while larger one doesn't?
 
> The docs for memory.reclaim say, "the kernel can over or under reclaim
> from the target cgroup" which this patch was trying to fix. Revert it
> until a less costly solution is found.
> 
> Signed-off-by: T.J. Mercier <tjmercier@...gle.com>
> ---
>  mm/memcontrol.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e4c8735e7c85..cee536c97151 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6956,8 +6956,8 @@ static ssize_t memory_reclaim(struct kernfs_open_file *of, char *buf,
>  			lru_add_drain_all();
>  
>  		reclaimed = try_to_free_mem_cgroup_pages(memcg,
> -					min(nr_to_reclaim - nr_reclaimed, SWAP_CLUSTER_MAX),
> -					GFP_KERNEL, reclaim_options);
> +						nr_to_reclaim - nr_reclaimed,
> +						GFP_KERNEL, reclaim_options);
>  
>  		if (!reclaimed && !nr_retries--)
>  			return -EAGAIN;
> -- 
> 2.43.0.429.g432eaa2c6b-goog
> 

-- 
Michal Hocko
SUSE Labs