Message-ID: <20240725151458.GJ13387@noisy.programming.kicks-ass.net>
Date: Thu, 25 Jul 2024 17:14:58 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: zhengzucheng <zhengzucheng@...wei.com>
Cc: mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
	mgorman@...e.de, vschneid@...hat.com, oleg@...hat.com,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH -next] sched/cputime: Fix mul_u64_u64_div_u64() precision
 for cputime

On Thu, Jul 25, 2024 at 10:49:46PM +0800, zhengzucheng wrote:
> Sorry, I made a mistake here. CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set.
> 
> On 2024/7/25 22:05, Peter Zijlstra wrote:
> > On Thu, Jul 25, 2024 at 12:03:15PM +0000, Zheng Zucheng wrote:
> > > In extreme test scenarios, the 14th field (utime) in /proc/xx/stat
> > > is greater than sum_exec_runtime:
> > > utime = 18446744073709518790 ns, rtime = 135989749728000 ns
> > > 
> > > In cputime_adjust(), stime becomes greater than rtime due to a
> > > mul_u64_u64_div_u64() precision problem.
> > > Before calling mul_u64_u64_div_u64():
> > > stime = 175136586720000, rtime = 135989749728000, utime = 1416780000.
> > > After calling mul_u64_u64_div_u64():
> > > stime = 135989949653530
> > > 
> > > Unsigned underflow occurs because rtime is less than stime:
> > > utime = rtime - stime = 135989749728000 - 135989949653530
> > > 		      = -199925530
> > > 		      = (u64)18446744073709518790
> > > 
> > > Trigger scenario:
> > > 1. The user task runs in kernel mode most of the time.
> > > 2. ARM64 architecture && CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y &&
> > >    TICK_CPU_ACCOUNTING=y
> > > 
> > > Fix the mul_u64_u64_div_u64() precision loss by clamping stime to rtime.
> > > 
> > > Fixes: 3dc167ba5729 ("sched/cputime: Improve cputime_adjust()")
> > > Signed-off-by: Zheng Zucheng <zhengzucheng@...wei.com>
> > > ---
> > >   kernel/sched/cputime.c | 2 ++
> > >   1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> > > index aa48b2ec879d..365c74e95537 100644
> > > --- a/kernel/sched/cputime.c
> > > +++ b/kernel/sched/cputime.c
> > > @@ -582,6 +582,8 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
> > >   	}
> > >   	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > > +	if (unlikely(stime > rtime))
> > > +		stime = rtime;

Ooh... I see, this is because the generic fallback for
mul_u64_u64_div_u64() is yuck :/

On x86_64 this is just two instructions and it does a native:

  u64*u64->u128
  u128/u64->u64
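
In C terms that amounts to something like the sketch below (my sketch,
assuming a compiler with unsigned __int128 support; the helper name is
made up, the real x86_64 version is inline asm):

	/* stand-in for the kernel's u64 */
	typedef unsigned long long u64;

	/*
	 * Sketch of the exact computation: full 128-bit product, then
	 * one exact 128/64 divide. Nothing is lost as long as the
	 * quotient fits in 64 bits (the real divq traps otherwise).
	 */
	static inline u64 mul_u64_u64_div_u64_sketch(u64 a, u64 b, u64 c)
	{
		return (u64)(((unsigned __int128)a * b) / c);
	}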

And then this (stime > rtime) should never happen. But in the generic
case we approximate, and urgh.

So yeah, but then perhaps add a comment like:

	/*
	 * Because mul_u64_u64_div_u64() can approximate on some
	 * architectures, enforce the constraint that: a*b/(b+c) <= a.
	 */
	if (unlikely(stime > rtime))
		stime = rtime;
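
To see the clamp doing its job, here is a tiny standalone userspace
demo (mine, with the values taken from the report above):

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		/* values from the report above */
		uint64_t rtime = 135989749728000ULL;
		uint64_t stime = 135989949653530ULL; /* approximated, > rtime */

		/* without the clamp, the u64 subtraction wraps to ~2^64 */
		printf("unclamped utime = %llu\n",
		       (unsigned long long)(rtime - stime));

		/* with the clamp, utime degrades gracefully to 0 */
		if (stime > rtime)
			stime = rtime;
		printf("clamped utime   = %llu\n",
		       (unsigned long long)(rtime - stime));

		return 0;
	}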

Also, I would look into doing a native arm64 version; I'd be surprised
if it could not do better than the generic variant.
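
For the record, one shape such an arm64 version could take (a sketch
under my own assumptions, not kernel code: mul + umulh give the exact
128-bit product, and an explicit restoring-divide loop stands in for
the 128/64 divide, since libgcc's __udivti3 isn't available in-kernel):

	/* stand-in for the kernel's u64 */
	typedef unsigned long long u64;

	/*
	 * Hypothetical exact version: the compiler emits mul + umulh
	 * for the __int128 multiply, then a bit-by-bit 128/64 divide
	 * avoids the generic fallback's approximation. Assumes the
	 * quotient fits in 64 bits (hi < c), the same precondition
	 * x86_64's divq has.
	 */
	static u64 mul_u64_u64_div_u64_exact(u64 a, u64 b, u64 c)
	{
		unsigned __int128 prod = (unsigned __int128)a * b;
		u64 hi = (u64)(prod >> 64);	/* umulh a, b */
		u64 lo = (u64)prod;		/* mul   a, b */
		u64 rem = hi, quot = 0;
		int i;

		for (i = 63; i >= 0; i--) {
			int carry = (int)(rem >> 63);

			/* shift the next dividend bit into the remainder */
			rem = (rem << 1) | ((lo >> i) & 1);
			if (carry || rem >= c) {
				rem -= c; /* wraps to the right value when carry is set */
				quot |= 1ULL << i;
			}
		}
		return quot;
	}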
