Message-ID: <20250721203624.3807041-1-kuniyu@google.com>
Date: Mon, 21 Jul 2025 20:35:19 +0000
From: Kuniyuki Iwashima <kuniyu@...gle.com>
To: "David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Neal Cardwell <ncardwell@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Willem de Bruijn <willemb@...gle.com>, Matthieu Baerts <matttbe@...nel.org>,
Mat Martineau <martineau@...nel.org>, Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>,
Shakeel Butt <shakeel.butt@...ux.dev>, Andrew Morton <akpm@...ux-foundation.org>
Cc: Simon Horman <horms@...nel.org>, Geliang Tang <geliang@...nel.org>,
Muchun Song <muchun.song@...ux.dev>, Kuniyuki Iwashima <kuniyu@...gle.com>,
Kuniyuki Iwashima <kuni1840@...il.com>, netdev@...r.kernel.org, mptcp@...ts.linux.dev,
cgroups@...r.kernel.org, linux-mm@...ck.org
Subject: [PATCH v1 net-next 00/13] net-memcg: Allow decoupling memcg from sk->sk_prot->memory_allocated.

Some protocols (e.g., TCP, UDP) have their own memory accounting for
socket buffers and charge memory to global per-protocol counters, which
are bounded by sysctls such as /proc/sys/net/ipv4/tcp_mem.

When running under a non-root cgroup, this memory is also charged to
the memcg and shows up as "sock" in memory.stat.

Sockets using such protocols are still subject to the global limits
and are thus affected by noisy neighbours outside the cgroup. This
makes it difficult to accurately estimate and configure appropriate
global limits.

If all workloads were guaranteed to be controlled under memcg, the issue
could be worked around by setting tcp_mem[0~2] to UINT_MAX.
However, this assumption does not always hold, and a single workload that
opts out of memcg can consume memory up to the global limit, which is
problematic.

This series introduces a new per-memcg knob to allow decoupling memcg
from the global memory accounting, which simplifies the memcg
configuration while keeping the global limits within a reasonable range.
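
To make the intended behaviour concrete, here is a rough userspace model
of the two charging paths. It is only a sketch under simplifying
assumptions: struct proto_acct, struct memcg_acct, the "isolated" field
and charge() are hypothetical names made up for this illustration, and
memory pressure states and forced charges are ignored. It is not the
code in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a global per-protocol counter and its limit. */
struct proto_acct {
	long allocated;		/* pages charged globally */
	long limit;		/* hard limit, in the spirit of tcp_mem[2] */
};

/* Hypothetical stand-in for the socket part of a memcg. */
struct memcg_acct {
	long sock_pages;	/* pages charged to this cgroup */
	long max;		/* cgroup limit in pages */
	bool isolated;		/* models the new per-memcg knob */
};

/*
 * Charge @pages for a socket.  @memcg may be NULL for a socket that is
 * not under any memcg.  Returns true if the charge was accepted.
 */
bool charge(struct proto_acct *proto, struct memcg_acct *memcg, long pages)
{
	assert(pages > 0);

	if (memcg) {
		/* The memcg limit applies in both modes. */
		if (memcg->sock_pages + pages > memcg->max)
			return false;

		/* Isolated: charge the memcg only, skip the global counter. */
		if (memcg->isolated) {
			memcg->sock_pages += pages;
			return true;
		}
	}

	/* Not isolated (or no memcg): the global limit applies as well. */
	if (proto->allocated + pages > proto->limit)
		return false;

	proto->allocated += pages;
	if (memcg)
		memcg->sock_pages += pages;

	return true;
}
```

In this model, an isolated memcg can keep charging up to its own limit
even when the global counter is exhausted by another workload, while
sockets outside memcg remain bounded by the global limit.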

Overview of the series:

  patch 1 is a bug fix for MPTCP
  patch 2 ~ 9 move sk->sk_memcg accesses to a single place
  patch 10 moves sk_memcg under CONFIG_MEMCG
  patch 11 & 12 introduce a flag and store it in the lowest bit of sk->sk_memcg
  patch 13 decouples memcg from sk_prot->memory_allocated based on the flag
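
For readers unfamiliar with stashing a flag in the lowest bit of a
pointer, the general technique looks roughly like the following
userspace sketch. It relies on the pointed-to object being at least
2-byte aligned, so that bit 0 of its address is always zero. All names
here (struct memcg, SK_MEMCG_FLAG_ISOLATED, memcg_pack() and friends)
are hypothetical and chosen for illustration; they are not taken from
the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in; any object with alignment >= 2 works. */
struct memcg {
	long sock_pages;
};

/* Hypothetical flag kept in bit 0 of the stored pointer value. */
#define SK_MEMCG_FLAG_ISOLATED	((uintptr_t)1)

/* Combine the pointer and the flag into one word. */
uintptr_t memcg_pack(struct memcg *memcg, bool isolated)
{
	uintptr_t val = (uintptr_t)memcg;

	/* Bit 0 must be free, i.e. the object is suitably aligned. */
	assert((val & SK_MEMCG_FLAG_ISOLATED) == 0);

	return isolated ? (val | SK_MEMCG_FLAG_ISOLATED) : val;
}

/* Recover the real pointer by masking the flag bit off. */
struct memcg *memcg_unpack(uintptr_t val)
{
	return (struct memcg *)(val & ~SK_MEMCG_FLAG_ISOLATED);
}

/* Test the flag without touching the pointed-to object. */
bool memcg_is_isolated(uintptr_t val)
{
	return (val & SK_MEMCG_FLAG_ISOLATED) != 0;
}
```

The point of this layout is that the flag can be checked on the socket
itself without dereferencing the memcg. The cost is that every user must
go through an accessor that masks the bit off, which is presumably why
the accesses are first moved to a single place.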


Kuniyuki Iwashima (13):
  mptcp: Fix up subflow's memcg when CONFIG_SOCK_CGROUP_DATA=n.
  mptcp: Use tcp_under_memory_pressure() in mptcp_epollin_ready().
  tcp: Simplify error path in inet_csk_accept().
  net: Call trace_sock_exceed_buf_limit() for memcg failure with
    SK_MEM_RECV.
  net: Clean up __sk_mem_raise_allocated().
  net-memcg: Introduce mem_cgroup_from_sk().
  net-memcg: Introduce mem_cgroup_sk_enabled().
  net-memcg: Pass struct sock to mem_cgroup_sk_(un)?charge().
  net-memcg: Pass struct sock to mem_cgroup_sk_under_memory_pressure().
  net: Define sk_memcg under CONFIG_MEMCG.
  net-memcg: Add memory.socket_isolated knob.
  net-memcg: Store memcg->socket_isolated in sk->sk_memcg.
  net-memcg: Allow decoupling memcg from global protocol memory
    accounting.

 Documentation/admin-guide/cgroup-v2.rst | 16 +++++
 include/linux/memcontrol.h              | 50 ++++++++-----
 include/net/proto_memory.h              | 10 ++-
 include/net/sock.h                      | 66 +++++++++++++++++
 include/net/tcp.h                       | 10 ++-
 mm/memcontrol.c                         | 84 +++++++++++++++++++---
 net/core/sock.c                         | 95 ++++++++++++++++---------
 net/ipv4/inet_connection_sock.c         | 35 +++++----
 net/ipv4/tcp_output.c                   | 13 ++--
 net/mptcp/protocol.h                    |  4 +-
 net/mptcp/subflow.c                     | 11 +--
 11 files changed, 299 insertions(+), 95 deletions(-)

--
2.50.0.727.gbf7dc18ff4-goog