Message-ID: <92674f89641f466b9ebbdf7681614ed3@baidu.com>
Date: Tue, 8 Jul 2025 00:10:54 +0000
From: "Li,Rongqing" <lirongqing@...du.com>
To: Steven Rostedt <rostedt@...dmis.org>
CC: Oleg Nesterov <oleg@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
David Laight <david.laight.linux@...il.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "vschneid@...hat.com" <vschneid@...hat.com>,
"mgorman@...e.de" <mgorman@...e.de>, "bsegall@...gle.com"
<bsegall@...gle.com>, "dietmar.eggemann@....com" <dietmar.eggemann@....com>,
"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
"juri.lelli@...hat.com" <juri.lelli@...hat.com>, "mingo@...hat.com"
<mingo@...hat.com>
Subject: Re: Re: Re: divide error in x86 and cputime
> On Mon, 7 Jul 2025 23:41:14 +0000
> "Li,Rongqing" <lirongqing@...du.com> wrote:
>
> > > On a second thought, this
> > >
> > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900);
> > >                     stime               rtime               stime + utime
> > >
> > > looks suspicious:
> > >
> > > - stime > stime + utime
> > >
> > > - rtime = 0xfffd213aabd74626 is absurdly huge
> > >
> > > so perhaps there is another problem?
> > >
> >
> > It happened when a process with 236 busy-polling threads had run for
> > about 904 days; the total time overflows 64 bits.
> >
> > Non-x86 systems may have the same issue: once (stime + utime) overflows
> > 64 bits, mul_u64_u64_div_u64() from lib/math/div64.c may cause a
> > division by 0.
> >
> > So for cputime, could cputime_adjust() return stime if stime + utime
> > overflows?
> >
> > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> > index 6dab4854..db0c273 100644
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
> >  		goto update;
> >  	}
> >
> > +	if (stime > (stime + utime)) {
> > +		goto update;
> > +	}
> > +
> >  	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> >  	/*
> >  	 * Because mul_u64_u64_div_u64() can approximate on some
> >
>
> Are you running 5.10.0? Because a diff of 5.10.238 from 5.10.0 gives:
>
> @@ -579,6 +579,12 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
>  	}
>
>  	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> +	/*
> +	 * Because mul_u64_u64_div_u64() can approximate on some
> +	 * achitectures; enforce the constraint that: a*b/(b+c) <= a.
> +	 */
> +	if (unlikely(stime > rtime))
> +		stime = rtime;
My 5.10 tree does not have the patch "sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime",
but I am sure that patch cannot fix this overflow issue, since the divide error happens inside mul_u64_u64_div_u64() itself, before the clamp is reached.
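
To illustrate, here is a minimal user-space sketch (not kernel code) using the values from the trace. The helper names are hypothetical, and the utime value is not in the report: it is reconstructed modulo 2^64 from the reported stime and the wrapped sum 0x09e00900. The sketch assumes a compiler with `unsigned __int128` (GCC or Clang). It shows that the sum wraps and that the true quotient needs more than 64 bits, which is the case where a 128-by-64-bit hardware divide on x86-64 would be expected to fault.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values from the trace; UTIME is derived, not reported (assumption). */
#define STIME UINT64_C(0x69f98da9ba980c00)
#define RTIME UINT64_C(0xfffd213aabd74626)
#define UTIME UINT64_C(0x960672564f47fd00)

/* Hypothetical helper: true if a + b wraps around in 64 bits. */
static inline bool u64_add_wraps(uint64_t a, uint64_t b)
{
	/* Unsigned addition is modular, so a wrapped sum is smaller than a. */
	return (uint64_t)(a + b) < a;
}

/*
 * Hypothetical helper: true if a * b / c fits in 64 bits.
 * The full 128-bit product is formed first, so only the size of the
 * quotient is being tested. c == 0 is reported as "does not fit".
 */
static inline bool muldiv_fits_u64(uint64_t a, uint64_t b, uint64_t c)
{
	unsigned __int128 q;

	if (c == 0)
		return false;

	q = (unsigned __int128)a * b / c;
	return q <= UINT64_MAX;
}
```

With these inputs, `u64_add_wraps(STIME, UTIME)` is true and `muldiv_fits_u64(STIME, RTIME, STIME + UTIME)` is false, so a check of stime against rtime after the call never gets a chance to run.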
Thanks
-Li
>
> update:
>
>
> Thus the result is what's getting screwed up.
>
> -- Steve