[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8a7cea99-0ab5-4dba-bc89-62d4819531eb@cdn77.com>
Date: Tue, 15 Jul 2025 09:01:15 +0200
From: Daniel Sedlak <daniel.sedlak@...77.com>
To: Kuniyuki Iwashima <kuniyu@...gle.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Jonathan Corbet <corbet@....net>, Neal Cardwell <ncardwell@...gle.com>,
David Ahern <dsahern@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>,
Shakeel Butt <shakeel.butt@...ux.dev>, Yosry Ahmed <yosry.ahmed@...ux.dev>,
linux-mm@...ck.org, netdev@...r.kernel.org,
Matyas Hurtik <matyas.hurtik@...77.com>,
Daniel Sedlak <danie.sedlak@...77.com>
Subject: Re: [PATCH v2 net-next 2/2] mm/vmpressure: add tracepoint for socket
pressure detection
Hi Kuniyuki,
On 7/14/25 8:02 PM, Kuniyuki Iwashima wrote:
>> +TRACE_EVENT(memcg_socket_under_pressure,
>> +
>> + TP_PROTO(const struct mem_cgroup *memcg, unsigned long scanned,
>> + unsigned long reclaimed),
>> +
>> + TP_ARGS(memcg, scanned, reclaimed),
>> +
>> + TP_STRUCT__entry(
>> + __field(u64, id)
>> + __field(unsigned long, scanned)
>> + __field(unsigned long, reclaimed)
>> + ),
>> +
>> + TP_fast_assign(
>> + __entry->id = cgroup_id(memcg->css.cgroup);
>> + __entry->scanned = scanned;
>> + __entry->reclaimed = reclaimed;
>> + ),
>> +
>> + TP_printk("memcg_id=%llu scanned=%lu reclaimed=%lu",
>> + __entry->id,
>
> Maybe a noob question: How can we translate the memcg ID
> to the /sys/fs/cgroup/... path ?
IMO this should be really named `cgroup_id` instead of `memcg_id`, but
we kept the latter to keep consistency with the rest of the file.
To find cgroup path you can use:
- find /sys/fs/cgroup/ -inum `memcg_id`, and it will print "path" to the
affected cgroup.
- or you can use bpftrace tracepoint hooks and there is a helper
function [1].
Or we can put the cgroup_path to the tracepoint instead of that ID, but
I feel it can be too much overhead, the paths can be pretty long.
Link: https://bpftrace.org/docs/latest#functions-cgroup_path [1]
> It would be nice to place this patch first and the description of
> patch 2 has how to use the new stat with this tracepoint.
Sure, can do that. However, I am unsure how a good idea is to
cross-reference commits, since each may go through a different tree
because each commit is for a different subsystem. They would have to go
through one tree, right?
>> + __entry->scanned,
>> + __entry->reclaimed)
>> +);
>> +
>> #endif /* _TRACE_MEMCG_H */
>>
>> /* This part must be outside protection */
>> diff --git a/mm/vmpressure.c b/mm/vmpressure.c
>> index bd5183dfd879..aa9583066731 100644
>> --- a/mm/vmpressure.c
>> +++ b/mm/vmpressure.c
>> @@ -21,6 +21,8 @@
>> #include <linux/printk.h>
>> #include <linux/vmpressure.h>
>>
>> +#include <trace/events/memcg.h>
>> +
>> /*
>> * The window size (vmpressure_win) is the number of scanned pages before
>> * we try to analyze scanned/reclaimed ratio. So the window is used as a
>> @@ -317,6 +319,7 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
>> * pressure events can occur.
>> */
>> WRITE_ONCE(memcg->socket_pressure, jiffies + HZ);
>> + trace_memcg_socket_under_pressure(memcg, scanned, reclaimed);
>
> This is triggered only when we enter the memory pressure state
> and not when we leave the state, right ? Is it possible to issue
> such an event ?
AFAIK, the currently used API in the vmpressure function does not have
anything like enter or leave the socket memory pressure state. It only
periodically re-arms the socket pressure for a specific duration. So the
current tracepoint is called when the socket pressure is re-armed.
I am not sure how feasible it is to rewrite it so we have enter and
leave logic. I am not that familiar with those code paths.
Thanks!
Daniel
Powered by blists - more mailing lists