lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250124155420.GA1222@cmpxchg.org>
Date: Fri, 24 Jan 2025 10:54:20 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Hugh Dickins <hughd@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Michal Hocko <mhocko@...e.com>,
	Roman Gushchin <roman.gushchin@...ux.dev>,
	Shakeel Butt <shakeel.butt@...ux.dev>,
	Muchun Song <muchun.song@...ux.dev>, linux-mm@...ck.org,
	cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: memcontrol: move memsw charge callbacks to v1

On Thu, Jan 23, 2025 at 10:53:04PM -0800, Hugh Dickins wrote:
> On Fri, 24 Jan 2025, Johannes Weiner wrote:
> 
> > The interweaving of two entirely different swap accounting strategies
> > has been one of the more confusing parts of the memcg code. Split out
> > the v1 code to clarify the implementation and a handful of callsites,
> > and to avoid building the v1 bits when !CONFIG_MEMCG_V1.
> > 
> >    text	  data	   bss	   dec	   hex	filename
> >   39253	  6446	  4160	 49859	  c2c3	mm/memcontrol.o.old
> >   38877	  6382	  4160	 49419	  c10b	mm/memcontrol.o
> > 
> > Signed-off-by: Johannes Weiner <hannes@...xchg.org>
> 
> I'm not really looking at this, but want to chime in that I found the
> memcg1 swap stuff in mm/memcontrol.c, not in mm/memcontrol-v1.c, very
> misleading when I was doing the folio_unqueue_deferred_split() business:
> so, without looking into the details of it, strongly approve of the
> direction you're taking here - thank you.

Thanks, I'm glad to hear that!

> But thought you could go even further, given that
> static inline bool do_memsw_account(void)
> {
> 	return !cgroup_subsys_on_dfl(memory_cgrp_subsys);
> }
> 
> I thought that amounted to do_memsw_account iff memcg_v1;
> but I never did grasp cgroup_subsys_on_dfl very well,
> so ignore me if I'm making no sense to you.

Yes, technically we should be able to move all the code guarded by
this check to v1 proper in some form.

[ It's a runtime check for whether the memory controller is attached
  to a cgroup1 or a cgroup2 mount. You can still mount the v1
  controller when !CONFIG_MEMCG_V1, but in that case it won't have any
  memory control files, so whether we update the memsw counter or not,
  the results of it won't be visible. ]

But memcg1_swapout()/swapin() are special in that they are completely
separate, v1-specific memcg entry points. The same is not true for the
other occurrences:

- mem_cgroup_margin():
- mem_cgroup_get_max():

	The v1 part is about half the function in both cases. We could
	split that out into a v1 subfunction, but IMO at a relatively
	high cost to the readability of the v1 control flow.

- drain_stock:
- try_charge_memcg:
- uncharge_batch:
- mem_cgroup_replace_folio:
- __mem_cgroup_try_charge_swap:
- __mem_cgroup_uncharge_swap:
- mem_cgroup_get_nr_swap_pages:
- mem_cgroup_swap_full:

	The majority of the code applies to v2 or both versions, and
	the v1 checks either cause an early return or guard the update
	to the memsw page_counter.

	So not much to farm out code-wise. And the test uses a static
	branch, so not much overhead to be cut either.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ