netdev - Re: [RFC PATCH bpf-next v3 09/12] selftests/bpf: Add tests for memcg_bpf

Message-ID: <b90069a3-86b4-4fba-9ff3-fe5f6c4e425d@gmail.com>
Date: Fri, 23 Jan 2026 12:47:02 -0800
From: JP Kobryn <inwardvessel@...il.com>
To: Hui Zhu <hui.zhu@...ux.dev>, Andrew Morton <akpm@...ux-foundation.org>,
 Johannes Weiner <hannes@...xchg.org>, Michal Hocko <mhocko@...nel.org>,
 Roman Gushchin <roman.gushchin@...ux.dev>,
 Shakeel Butt <shakeel.butt@...ux.dev>, Muchun Song <muchun.song@...ux.dev>,
 Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
 Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau
 <martin.lau@...ux.dev>, Eduard Zingerman <eddyz87@...il.com>,
 Song Liu <song@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>,
 John Fastabend <john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>,
 Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>,
 Jiri Olsa <jolsa@...nel.org>, Shuah Khan <shuah@...nel.org>,
 Peter Zijlstra <peterz@...radead.org>, Miguel Ojeda <ojeda@...nel.org>,
 Nathan Chancellor <nathan@...nel.org>, Kees Cook <kees@...nel.org>,
 Tejun Heo <tj@...nel.org>, Jeff Xu <jeffxu@...omium.org>, mkoutny@...e.com,
 Jan Hendrik Farr <kernel@...rr.cc>, Christian Brauner <brauner@...nel.org>,
 Randy Dunlap <rdunlap@...radead.org>, Brian Gerst <brgerst@...il.com>,
 Masahiro Yamada <masahiroy@...nel.org>, davem@...emloft.net,
 Jakub Kicinski <kuba@...nel.org>, Jesper Dangaard Brouer <hawk@...nel.org>,
 Willem de Bruijn <willemb@...gle.com>, Jason Xing
 <kerneljasonxing@...il.com>, Paul Chaignon <paul.chaignon@...il.com>,
 Anton Protopopov <a.s.protopopov@...il.com>, Amery Hung
 <ameryhung@...il.com>, Chen Ridong <chenridong@...weicloud.com>,
 Lance Yang <lance.yang@...ux.dev>, Jiayuan Chen <jiayuan.chen@...ux.dev>,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org, cgroups@...r.kernel.org,
 bpf@...r.kernel.org, netdev@...r.kernel.org, linux-kselftest@...r.kernel.org
Cc: Hui Zhu <zhuhui@...inos.cn>, Geliang Tang <geliang@...nel.org>
Subject: Re: [RFC PATCH bpf-next v3 09/12] selftests/bpf: Add tests for
 memcg_bpf_ops

Hi Hui,

On 1/23/26 1:00 AM, Hui Zhu wrote:
> From: Hui Zhu <zhuhui@...inos.cn>
> 
> Add a comprehensive selftest suite for the `memcg_bpf_ops`
> functionality. These tests validate that BPF programs can correctly
> influence memory cgroup throttling behavior by implementing the new
> hooks.
> 
> The test suite is added in `prog_tests/memcg_ops.c` and covers
> several key scenarios:
> 
> 1. `test_memcg_ops_over_high`:
>     Verifies that a BPF program can trigger throttling on a low-priority
>     cgroup by returning a delay from the `get_high_delay_ms` hook when a
>     high-priority cgroup is under pressure.
> 
> 2. `test_memcg_ops_below_low_over_high`:
>     Tests the combination of the `below_low` and `get_high_delay_ms`
>     hooks, ensuring they work together as expected.
> 
> 3. `test_memcg_ops_below_min_over_high`:
>     Validates the interaction between the `below_min` and
>     `get_high_delay_ms` hooks.
> 
> The test framework sets up a cgroup hierarchy with high and low
> priority groups, attaches BPF programs, runs memory-intensive
> workloads, and asserts that the observed throttling (measured by
> workload execution time) matches expectations.
> 
> The BPF program (`progs/memcg_ops.c`) uses a tracepoint on
> `memcg:count_memcg_events` (specifically PGFAULT) to detect memory
> pressure and trigger the appropriate hooks in response. This test
> suite provides essential validation for the new memory control
> mechanisms.
> 
> Signed-off-by: Geliang Tang <geliang@...nel.org>
> Signed-off-by: Hui Zhu <zhuhui@...inos.cn>
> ---
[..]
> diff --git a/tools/testing/selftests/bpf/prog_tests/memcg_ops.c b/tools/testing/selftests/bpf/prog_tests/memcg_ops.c
> new file mode 100644
> index 000000000000..9a8d16296f2d
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/memcg_ops.c
> @@ -0,0 +1,537 @@
[..]
> +
> +static void
> +real_test_memcg_ops_child_work(const char *cgroup_path,
> +			       char *data_filename,
> +			       char *time_filename,
> +			       int read_times)
> +{
> +	struct timeval start, end;
> +	double elapsed;
> +	FILE *fp;
> +
> +	if (!ASSERT_OK(join_parent_cgroup(cgroup_path), "join_parent_cgroup"))
> +		goto out;
> +
> +	if (env.verbosity >= VERBOSE_NORMAL)
> +		printf("%s %d begin\n", __func__, getpid());
> +
> +	gettimeofday(&start, NULL);
> +
> +	if (!ASSERT_OK(write_file(data_filename), "write_file"))
> +		goto out;
> +
> +	if (env.verbosity >= VERBOSE_NORMAL)
> +		printf("%s %d write_file done\n", __func__, getpid());
> +
> +	if (!ASSERT_OK(read_file(data_filename, read_times), "read_file"))
> +		goto out;
> +
> +	gettimeofday(&end, NULL);
> +
> +	elapsed = (end.tv_sec - start.tv_sec) +
> +		  (end.tv_usec - start.tv_usec) / 1000000.0;
> +
> +	if (env.verbosity >= VERBOSE_NORMAL)
> +		printf("%s %d end %.6f\n", __func__, getpid(), elapsed);
> +
> +	fp = fopen(time_filename, "w");
> +	if (!ASSERT_OK_PTR(fp, "fopen"))
> +		goto out;
> +	fprintf(fp, "%.6f", elapsed);
> +	fclose(fp);
> +
> +out:
> +	exit(0);
> +}
> +

[..]

> +static void real_test_memcg_ops(int read_times)
> +{
> +	int ret;
> +	char data_file1[] = "/tmp/test_data_XXXXXX";
> +	char data_file2[] = "/tmp/test_data_XXXXXX";
> +	char time_file1[] = "/tmp/test_time_XXXXXX";
> +	char time_file2[] = "/tmp/test_time_XXXXXX";
> +	pid_t pid1, pid2;
> +	double time1, time2;
> +
> +	ret = mkstemp(data_file1);
> +	if (!ASSERT_GT(ret, 0, "mkstemp"))
> +		return;
> +	close(ret);
> +	ret = mkstemp(data_file2);
> +	if (!ASSERT_GT(ret, 0, "mkstemp"))
> +		goto cleanup_data_file1;
> +	close(ret);
> +	ret = mkstemp(time_file1);
> +	if (!ASSERT_GT(ret, 0, "mkstemp"))
> +		goto cleanup_data_file2;
> +	close(ret);
> +	ret = mkstemp(time_file2);
> +	if (!ASSERT_GT(ret, 0, "mkstemp"))
> +		goto cleanup_time_file1;
> +	close(ret);
> +
> +	pid1 = fork();
> +	if (!ASSERT_GE(pid1, 0, "fork"))
> +		goto cleanup;
> +	if (pid1 == 0)
> +		real_test_memcg_ops_child_work(CG_LOW_DIR,
> +					       data_file1,
> +					       time_file1,
> +					       read_times);

Would it be better to call exit() right after
real_test_memcg_ops_child_work() returns instead of from within it? That
way the fork/exit/wait logic is contained in one scope, which makes the
process lifetimes easier to track. I had to go back and search for the
call to exit(), because at a glance this function appears to go on to
call fork() and waitpid() in both the parent and the child processes
(though it really does not).
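
A minimal sketch of the shape I have in mind. The names child_work() and
run_in_child() are hypothetical stand-ins, not functions from the patch,
and the work itself is reduced to a printf(); the only point is that the
child's exit() sits next to the fork() and waitpid() that bracket it:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for real_test_memcg_ops_child_work(): it does the
 * child's work and returns a status instead of calling exit() itself.
 */
static int child_work(const char *label)
{
	if (!label)
		return 1;
	printf("child %d: %s\n", (int)getpid(), label);
	return 0;
}

/*
 * Fork, run the work in the child, and reap the child. Returns the
 * child's exit status, or -1 if fork()/waitpid() failed or the child did
 * not exit normally.
 */
static int run_in_child(const char *label)
{
	int status;
	pid_t pid;

	/* Avoid duplicating buffered parent output in the child. */
	fflush(NULL);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		/* Child: the exit is visible right next to the fork(). */
		exit(child_work(label));

	/* Parent only from here on. */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (!WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

With this layout a reader of the forking function can see that the child
never falls through to the later fork()/waitpid() calls, without having
to open the worker function. A caller would simply check
run_in_child("low") against 0.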