linux-kernel - Re: [External] Re: [RFC 2/5] memcontrol: add boot option to enable memsw account on dfl


Message-ID: <qfxzzbcfnojz3oz2ackzorwokhmr2dbkxfgbmewd74vtzrzxkh@rlqide3wg2v7>
Date: Fri, 11 Apr 2025 18:57:48 +0200
From: Michal Koutný <mkoutny@...e.com>
To: jingxiang zeng <jingxiangzeng.cas@...il.com>, 
	Zhongkun He <hezhongkun.hzk@...edance.com>
Cc: Shakeel Butt <shakeel.butt@...ux.dev>, 
	Johannes Weiner <hannes@...xchg.org>, Roman Gushchin <roman.gushchin@...ux.dev>, 
	Jingxiang Zeng <linuszeng@...cent.com>, akpm@...ux-foundation.org, linux-mm@...ck.org, 
	cgroups@...r.kernel.org, linux-kernel@...r.kernel.org, mhocko@...nel.org, 
	muchun.song@...ux.dev, kasong@...cent.com
Subject: Re: [External] Re: [RFC 2/5] memcontrol: add boot option to enable
 memsw account on dfl

On Thu, Apr 03, 2025 at 05:16:45PM +0800, jingxiang zeng <jingxiangzeng.cas@...il.com> wrote:
> > We encountered an issue, which is also a real use case. With memory offloading,
> > we can move some cold pages to swap. Suppose an application’s peak memory
> > usage at certain times is 10GB, while at other times, it exists in a
> > combination of
> > memory and swap. If we set limits on memory or swap separately, it would lack
> > flexibility—sometimes it needs 1GB memory + 9GB swap, sometimes 5GB
> > memory + 5GB swap, or even 10GB memory + 0GB swap. Therefore, we strongly
> > hope to use the mem+swap charging method in cgroupv2

The app's peak need determines memory.max=10G.
The apparent flexibility depends on how many competitors the app
has. It can run with 5GB memory + 5GB swap under some competition, or
1GB memory + 9GB swap under different (more memory-demanding) competition.
If you want to prevent a faulty app from eating up all of swap for itself
(like it's possible with memsw), you may define some memory.swap.max.
(There's no unique correspondence between this and the original memsw value
since the cost of mem<->swap is variable.)
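
To illustrate the point about the two limit models, here is a minimal
sketch (not kernel code). It assumes a simplified model in which v1 memsw
charges memory and swap against one combined limit, while v2 checks
memory.max and memory.swap.max separately. The function names and the 9G
swap cap are hypothetical, chosen only for the example.

```python
GIB = 1024 ** 3

def allowed_v1_memsw(mem, swap, memsw_limit):
    """Simplified v1 model: memory + swap share one combined limit."""
    return mem + swap <= memsw_limit

def allowed_v2(mem, swap, memory_max, swap_max):
    """Simplified v2 model: memory and swap are limited independently."""
    return mem <= memory_max and swap <= swap_max

# The three mixes from the quoted use case, in GiB.
mixes = [(1, 9), (5, 5), (10, 0)]

for mem_gib, swap_gib in mixes:
    mem, swap = mem_gib * GIB, swap_gib * GIB
    v1_ok = allowed_v1_memsw(mem, swap, 10 * GIB)
    # Assumed v2 settings: memory.max=10G, memory.swap.max=9G.
    v2_ok = allowed_v2(mem, swap, 10 * GIB, 9 * GIB)
    print(f"{mem_gib}G mem + {swap_gib}G swap: v1 memsw={v1_ok}, v2={v2_ok}")
```

All three mixes fit under both models. The models differ for a state such
as 10G memory + 9G swap: the v2 settings above permit it, the combined
memsw limit does not, which is one reason the two values do not map onto
each other uniquely.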


> Yes, in the container scenario, if swap is enabled on the server and
> the customer's container requires 10GB of memory, we only need to set
> memory.memsw.limit_in_bytes=10GB, and the kernel can automatically
> swap out part of the business container's memory to swap according to
> the server's memory pressure, and it can be fully guaranteed that the
> customer's container will not use more memory because swap is enabled
> on the server.

This made me consider various causes of the pressure:

- global pressure
  - it doesn't change memcg's total consumption (memsw.usage=const)
  - memsw limit does nothing
- self-memcg pressure
  - new allocations against own limit and memsw.usage hits memsw.limit
  - memsw.limit prevents new allocations that would extend swap
  - achievable with memory.swap.max=0
- ancestral pressure 
  - when sibling needs to allocate but limit is on ancestor
  - similar to global pressure (memsw.usage=const), self memsw.limit
    does nothing

- or there is no outer pressure but you want to prevent new allocations
  when something has been swapped out already
  - swapped out amount is a debt
    - memsw.limit behavior is suboptimal until the debt needs to be
      repaid
      - repay is when someone else needs the swap space

The above is a free flow of thoughts but I'd condense it into these conversions:
- memory.max := memory.memsw.limit_in_bytes
- memory.swap.max := anything between 0 and memory.memsw.limit_in_bytes
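
A minimal sketch of that conversion, assuming byte values and a
hypothetical swap_share knob (0.0 to 1.0) that picks a point in the
"anything between" range. The helper name memsw_to_v2 is made up for
illustration. It only computes the values and does not touch cgroupfs.

```python
def memsw_to_v2(memsw_limit_bytes, swap_share=1.0):
    """Map a v1 memory.memsw.limit_in_bytes value to v2 settings.

    swap_share is a hypothetical policy knob: 0.0 gives memory.swap.max=0,
    1.0 gives memory.swap.max equal to the old memsw limit.
    """
    if memsw_limit_bytes < 0:
        raise ValueError("limit must be non-negative")
    if not 0.0 <= swap_share <= 1.0:
        raise ValueError("swap_share must be within [0.0, 1.0]")
    return {
        # memory.max := memory.memsw.limit_in_bytes
        "memory.max": str(memsw_limit_bytes),
        # memory.swap.max := somewhere between 0 and the memsw limit
        "memory.swap.max": str(int(memsw_limit_bytes * swap_share)),
    }

settings = memsw_to_v2(10 * 1024 ** 3, swap_share=0.5)
for name, value in settings.items():
    print(f"{name} = {value}")
```

Which swap_share to choose is a policy decision and, as noted above, has
no single right answer.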

Did I fail to capture some mode where memsw limits were superior?

Thanks,
Michal
