Message-ID: <aHBZJ_k6cSxyAg3x@JPC00244420>
Date: Fri, 11 Jul 2025 09:21:59 +0900
From: Shashank Balaji <shashank.mahadasyam@...y.com>
To: Tejun Heo <tj@...nel.org>
Cc: cgroups@...r.kernel.org, linux-kselftest@...r.kernel.org,
linux-kernel@...r.kernel.org, Johannes Weiner <hannes@...xchg.org>,
Michal Koutný <mkoutny@...e.com>,
Shuah Khan <shuah@...nel.org>,
Shinya Takumi <shinya.takumi@...y.com>
Subject: Re: [PATCH v3] selftests/cgroup: fix cpu.max tests
Hi Tejun,
Could you please take a look at this patch? After some back-and-forth
with Michal, this is v3, carrying his Acked-by.
Thanks,
Shashank
On Fri, Jul 04, 2025 at 08:08:41PM +0900, Shashank Balaji wrote:
> Current cpu.max tests (both the normal one and the nested one) are broken.
>
> They set up cpu.max with a 1000 us quota and the default period (100,000 us).
> A cpu hog is then run for 1 s of wall-clock time. This corresponds to 10
> periods, hence an expected usage of 10,000 us. We want the measured usage
> (as per cpu.stat) to be close to 10,000 us.
>
> Previously, this approximate equality test was done by
> `!values_close(usage_usec, expected_usage_usec, 95)`: if the absolute
> difference between usage_usec and expected_usage_usec is greater than 95% of
> their sum, then we pass. And expected_usage_usec was set to 1,000,000 us.
> Mathematically, this translates to the following being true for pass:
>
> |usage - expected_usage| > (usage + expected_usage)*0.95
>
> If usage > expected_usage:
> usage - expected_usage > (usage + expected_usage)*0.95
> 0.05*usage > 1.95*expected_usage
> usage > 39*expected_usage = 39s
>
> If usage < expected_usage:
> expected_usage - usage > (usage + expected_usage)*0.95
> 0.05*expected_usage > 1.95*usage
> usage < 0.0256*expected_usage = 25,600 us
>
> Combined,
>
> Pass if usage < 25,600 us or > 39 s,
>
> which makes no sense given that all we need is for usage_usec to be close to
> 10,000 us.
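>
> To make the old pass window concrete, here is a minimal standalone sketch.
> It assumes values_close() boils down to the "absolute difference vs. err%
> of the sum" comparison described above; the helper here is a local model
> for illustration, not the selftest code itself:
>
>   #include <stdio.h>
>   #include <stdlib.h>
>
>   /* local model of the values_close() semantics described above */
>   static int values_close(long a, long b, int err)
>   {
>           return labs(a - b) <= (a + b) / 100 * err;
>   }
>
>   int main(void)
>   {
>           long expected = 1000000;  /* old expected_usage_usec: 1 s */
>
>           /* old pass condition: !values_close(usage, expected, 95) */
>           printf("%d\n", !values_close(0, expected, 95));        /* 1: 0 us passes */
>           printf("%d\n", !values_close(30000, expected, 95));    /* 0: 30 ms fails */
>           printf("%d\n", !values_close(40000000, expected, 95)); /* 1: 40 s passes */
>           return 0;
>   }
>
> Zero usage passes, 40 s of usage passes, yet a value only a few times over
> the intended 10,000 us fails.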
>
> Fix this by explicitly calculating the expected usage based on the
> configured quota, the default period, and the run duration, and compare
> usage_usec and expected_usage_usec using values_close() with a 10% error
> margin.
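>
> Worked out with the values used in the patch (quota 1,000 us, default
> period 100,000 us, 1 s of wall-clock runtime), this gives:
>
>   n_periods           = 1,000,000 / 100,000        = 10
>   remainder_usec      = 1,000,000 - 10 * 100,000   = 0
>   expected_usage_usec = 10 * 1,000 + MIN(0, 1,000) = 10,000 us
>
> and with a 10% margin values_close() accepts usage_usec of roughly
> 8,200 us to 12,200 us (|usage - expected| <= 10% of their sum), a range
> the measurements quoted below fall into.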
>
> Also, use snprintf to get the quota string to write to cpu.max instead of
> hardcoding the quota, ensuring a single source of truth.
>
> Remove the check comparing user_usec and expected_usage_usec, since running
> this test with added printfs shows that user_usec and usage_usec can
> regularly exceed the theoretical expected_usage_usec:
>
> $ sudo ./test_cpu
> user: 10485, usage: 10485, expected: 10000
> ok 1 test_cpucg_max
> user: 11127, usage: 11127, expected: 10000
> ok 2 test_cpucg_max_nested
> $ sudo ./test_cpu
> user: 10286, usage: 10286, expected: 10000
> ok 1 test_cpucg_max
> user: 10404, usage: 11271, expected: 10000
> ok 2 test_cpucg_max_nested
>
> Hence, a values_close() check of usage_usec and expected_usage_usec is
> sufficient.
>
> Fixes: a79906570f9646ae17 ("cgroup: Add test_cpucg_max_nested() testcase")
> Fixes: 889ab8113ef1386c57 ("cgroup: Add test_cpucg_max() testcase")
> Acked-by: Michal Koutný <mkoutny@...e.com>
> Signed-off-by: Shashank Balaji <shashank.mahadasyam@...y.com>
>
> ---
>
> Changes in v3:
> - Simplified commit message
> - Explained why the "user_usec >= expected_usage_usec" check is removed
> - Added fixes tags and Michal's Acked-by
> - No code changes
> - v2: https://lore.kernel.org/all/20250703120325.2905314-1-shashank.mahadasyam@sony.com/
>
> Changes in v2:
> - Incorporate Michal's suggestions:
> - Merge two patches into one
> - Generate the quota string from the variable instead of hardcoding it
> - Use values_close() instead of labs()
> - Explicitly calculate expected_usage_usec
> - v1: https://lore.kernel.org/all/20250701-kselftest-cgroup-fix-cpu-max-v1-0-049507ad6832@sony.com/
> ---
> tools/testing/selftests/cgroup/test_cpu.c | 63 ++++++++++++++++-------
> 1 file changed, 43 insertions(+), 20 deletions(-)
>
> diff --git a/tools/testing/selftests/cgroup/test_cpu.c b/tools/testing/selftests/cgroup/test_cpu.c
> index a2b50af8e9ee..2a60e6c41940 100644
> --- a/tools/testing/selftests/cgroup/test_cpu.c
> +++ b/tools/testing/selftests/cgroup/test_cpu.c
> @@ -2,6 +2,7 @@
>
> #define _GNU_SOURCE
> #include <linux/limits.h>
> +#include <sys/param.h>
> #include <sys/sysinfo.h>
> #include <sys/wait.h>
> #include <errno.h>
> @@ -645,10 +646,16 @@ test_cpucg_nested_weight_underprovisioned(const char *root)
> static int test_cpucg_max(const char *root)
> {
> int ret = KSFT_FAIL;
> - long usage_usec, user_usec;
> - long usage_seconds = 1;
> - long expected_usage_usec = usage_seconds * USEC_PER_SEC;
> + long quota_usec = 1000;
> + long default_period_usec = 100000; /* cpu.max's default period */
> + long duration_seconds = 1;
> +
> + long duration_usec = duration_seconds * USEC_PER_SEC;
> + long usage_usec, n_periods, remainder_usec, expected_usage_usec;
> char *cpucg;
> + char quota_buf[32];
> +
> + snprintf(quota_buf, sizeof(quota_buf), "%ld", quota_usec);
>
> cpucg = cg_name(root, "cpucg_test");
> if (!cpucg)
> @@ -657,13 +664,13 @@ static int test_cpucg_max(const char *root)
> if (cg_create(cpucg))
> goto cleanup;
>
> - if (cg_write(cpucg, "cpu.max", "1000"))
> + if (cg_write(cpucg, "cpu.max", quota_buf))
> goto cleanup;
>
> struct cpu_hog_func_param param = {
> .nprocs = 1,
> .ts = {
> - .tv_sec = usage_seconds,
> + .tv_sec = duration_seconds,
> .tv_nsec = 0,
> },
> .clock_type = CPU_HOG_CLOCK_WALL,
> @@ -672,14 +679,19 @@ static int test_cpucg_max(const char *root)
> goto cleanup;
>
> usage_usec = cg_read_key_long(cpucg, "cpu.stat", "usage_usec");
> - user_usec = cg_read_key_long(cpucg, "cpu.stat", "user_usec");
> - if (user_usec <= 0)
> + if (usage_usec <= 0)
> goto cleanup;
>
> - if (user_usec >= expected_usage_usec)
> - goto cleanup;
> + /*
> + * The following calculation applies only since
> + * the cpu hog is set to run as per wall-clock time
> + */
> + n_periods = duration_usec / default_period_usec;
> + remainder_usec = duration_usec - n_periods * default_period_usec;
> + expected_usage_usec
> + = n_periods * quota_usec + MIN(remainder_usec, quota_usec);
>
> - if (values_close(usage_usec, expected_usage_usec, 95))
> + if (!values_close(usage_usec, expected_usage_usec, 10))
> goto cleanup;
>
> ret = KSFT_PASS;
> @@ -698,10 +710,16 @@ static int test_cpucg_max(const char *root)
> static int test_cpucg_max_nested(const char *root)
> {
> int ret = KSFT_FAIL;
> - long usage_usec, user_usec;
> - long usage_seconds = 1;
> - long expected_usage_usec = usage_seconds * USEC_PER_SEC;
> + long quota_usec = 1000;
> + long default_period_usec = 100000; /* cpu.max's default period */
> + long duration_seconds = 1;
> +
> + long duration_usec = duration_seconds * USEC_PER_SEC;
> + long usage_usec, n_periods, remainder_usec, expected_usage_usec;
> char *parent, *child;
> + char quota_buf[32];
> +
> + snprintf(quota_buf, sizeof(quota_buf), "%ld", quota_usec);
>
> parent = cg_name(root, "cpucg_parent");
> child = cg_name(parent, "cpucg_child");
> @@ -717,13 +735,13 @@ static int test_cpucg_max_nested(const char *root)
> if (cg_create(child))
> goto cleanup;
>
> - if (cg_write(parent, "cpu.max", "1000"))
> + if (cg_write(parent, "cpu.max", quota_buf))
> goto cleanup;
>
> struct cpu_hog_func_param param = {
> .nprocs = 1,
> .ts = {
> - .tv_sec = usage_seconds,
> + .tv_sec = duration_seconds,
> .tv_nsec = 0,
> },
> .clock_type = CPU_HOG_CLOCK_WALL,
> @@ -732,14 +750,19 @@ static int test_cpucg_max_nested(const char *root)
> goto cleanup;
>
> usage_usec = cg_read_key_long(child, "cpu.stat", "usage_usec");
> - user_usec = cg_read_key_long(child, "cpu.stat", "user_usec");
> - if (user_usec <= 0)
> + if (usage_usec <= 0)
> goto cleanup;
>
> - if (user_usec >= expected_usage_usec)
> - goto cleanup;
> + /*
> + * The following calculation applies only since
> + * the cpu hog is set to run as per wall-clock time
> + */
> + n_periods = duration_usec / default_period_usec;
> + remainder_usec = duration_usec - n_periods * default_period_usec;
> + expected_usage_usec
> + = n_periods * quota_usec + MIN(remainder_usec, quota_usec);
>
> - if (values_close(usage_usec, expected_usage_usec, 95))
> + if (!values_close(usage_usec, expected_usage_usec, 10))
> goto cleanup;
>
> ret = KSFT_PASS;
> --
> 2.43.0
>