linux-kernel - Re: [PATCH] watchdog/softlockup:Fix incorrect CPU utilization output during softlockup


Message-Id: <20250817191825.ef254428d688d987333d4f4e@linux-foundation.org>
Date: Sun, 17 Aug 2025 19:18:25 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: yaozhenguo <yaozhenguo1@...il.com>
Cc: tglx@...utronix.de, yaoma@...ux.alibaba.com, max.kellermann@...os.com,
 lihuafei1@...wei.com, yaozhenguo@...com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] watchdog/softlockup:Fix incorrect CPU utilization
 output during softlockup

On Tue, 12 Aug 2025 16:25:10 +0800 yaozhenguo <yaozhenguo1@...il.com> wrote:

> From: ZhenguoYao <yaozhenguo1@...il.com>
> 
> Since we use 16-bit precision, the raw nanosecond values are
> shifted right by 24 bits (an integer division by 2^24), which
> discards the low-order bits.  This can make the CPU utilization
> calculation slightly inaccurate.  Under normal circumstances this
> isn't an issue.  However, when CPU utilization reaches 100%, the
> calculated result might exceed 100%.  For example, with raw data
> like the following:
> 
> sample_period 400000134 new_stat 83648414036 old_stat 83247417494
> 
> sample_period=400000134/2^24=23
> new_stat=83648414036/2^24=4985
> old_stat=83247417494/2^24=4961
> util=DIV_ROUND_UP(100*(4985-4961), 23)=105%
> 
> The log then shows:
> 
> CPU#3 Utilization every 0s during lockup:
>     #1:   0% system,          0% softirq,   105% hardirq,     0% idle
>     #2:   0% system,          0% softirq,   105% hardirq,     0% idle
>     #3:   0% system,          0% softirq,   100% hardirq,     0% idle
>     #4:   0% system,          0% softirq,   105% hardirq,     0% idle
>     #5:   0% system,          0% softirq,   105% hardirq,     0% idle
> 
> To avoid confusion, we enforce a 100% display cap when
> calculations exceed this threshold.
> 
> ...
>
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -444,6 +444,13 @@ static void update_cpustat(void)
>  		old_stat = __this_cpu_read(cpustat_old[i]);
>  		new_stat = get_16bit_precision(cpustat[tracked_stats[i]]);
>  		util = DIV_ROUND_UP(100 * (new_stat - old_stat), sample_period_16);
> +		/* Since we use 16-bit precision, the raw data will undergo

		/*
		 * Since ...

please.

> +		 * integer division, which may sometimes result in data loss,
> +		 * and then result might exceed 100%. To avoid confusion,
> +		 * we enforce a 100% display cap when calculations exceed this threshold.
> +		 */
> +		if (util > 100)
> +			util = 100;
>  		__this_cpu_write(cpustat_util[tail][i], util);
>  		__this_cpu_write(cpustat_old[i], new_stat);
>  	}

Can we do something to make this output more accurate?  For example,

	return (data_ns + (1 << 23)) >> 24LL;

would round to the nearest multiple of 16.8ms?
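
To make the difference concrete, here is a minimal userspace sketch
comparing the truncating conversion with the rounding one suggested
above.  The helper names (trunc_16bit, round_16bit, util_pct) are
hypothetical and only mirror the arithmetic quoted in this thread; this
is not the kernel code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Same rounding-up division the quoted hunk relies on. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Truncating conversion: drop the low 24 bits (units of ~16.8ms). */
static uint16_t trunc_16bit(uint64_t data_ns)
{
	return (uint16_t)(data_ns >> 24);
}

/* Rounding conversion: add half a unit (2^23 ns) before shifting. */
static uint16_t round_16bit(uint64_t data_ns)
{
	return (uint16_t)((data_ns + (1ULL << 23)) >> 24);
}

/* Utilization in percent; all inputs are already in 2^24 ns units. */
static unsigned int util_pct(uint16_t new_stat, uint16_t old_stat,
			     uint16_t period)
{
	/* The cast keeps the delta correct if the 16-bit counter wrapped. */
	uint16_t delta = (uint16_t)(new_stat - old_stat);

	assert(period != 0);
	return DIV_ROUND_UP(100u * delta, period);
}
```

With the sample values from the changelog (sample_period 400000134,
new_stat 83648414036, old_stat 83247417494), truncation gives 23, 4985
and 4961, hence 105%, while rounding gives 24, 4986 and 4962, hence
100%.  Rounding reduces the error but may not rule out every overshoot,
so the cap could still be kept as a safety net.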