Message-ID: <20190719143742.GA32243@redhat.com>
Date: Fri, 19 Jul 2019 16:37:42 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Fox <afox@...hat.com>,
Stephen Johnston <sjohnsto@...hat.com>,
linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Stanislaw Gruszka <sgruszka@...hat.com>
Subject: Re: [PATCH] sched/cputime: make scale_stime() more precise
On 07/19, Peter Zijlstra wrote:
>
> > > $ ./stime 300000
> > > start=300000000000000
> > > ut(diff)/st(diff): 299994875 ( 0) 300009124 (2000)
> > > ut(diff)/st(diff): 299994875 ( 0) 300011124 (2000)
> > > ut(diff)/st(diff): 299994875 ( 0) 300013124 (2000)
> > > ut(diff)/st(diff): 299994875 ( 0) 300015124 (2000)
> > > ut(diff)/st(diff): 299994875 ( 0) 300017124 (2000)
> > > ut(diff)/st(diff): 299994875 ( 0) 300019124 (2000)
> > > ut(diff)/st(diff): 299994875 ( 0) 300021124 (2000)
> > > ut(diff)/st(diff): 299994875 ( 0) 300023124 (2000)
> > > ut(diff)/st(diff): 299994875 ( 0) 300025124 (2000)
> > > ut(diff)/st(diff): 299994875 ( 0) 300027124 (2000)
> > > ut(diff)/st(diff): 299994875 ( 0) 300029124 (2000)
> > > ut(diff)/st(diff): 299996875 (2000) 300029124 ( 0)
> > > ut(diff)/st(diff): 299998875 (2000) 300029124 ( 0)
> > > ut(diff)/st(diff): 300000875 (2000) 300029124 ( 0)
> > > ut(diff)/st(diff): 300002875 (2000) 300029124 ( 0)
> > > ut(diff)/st(diff): 300004875 (2000) 300029124 ( 0)
> > > ut(diff)/st(diff): 300006875 (2000) 300029124 ( 0)
> > > ut(diff)/st(diff): 300008875 (2000) 300029124 ( 0)
> > > ut(diff)/st(diff): 300010875 (2000) 300029124 ( 0)
> > > ut(diff)/st(diff): 300012055 (1180) 300029944 ( 820)
> > > ut(diff)/st(diff): 300012055 ( 0) 300031944 (2000)
> > > ut(diff)/st(diff): 300012055 ( 0) 300033944 (2000)
> > > ut(diff)/st(diff): 300012055 ( 0) 300035944 (2000)
> > > ut(diff)/st(diff): 300012055 ( 0) 300037944 (2000)
> > >
> > > shows the problem even when sum_exec_runtime is not that big: 300000 secs.
> > >
> > > The new implementation of scale_stime() does the additional div64_u64_rem()
> > > in a loop, but see the comment: as long as it is used by cputime_adjust(),
> > > this can happen only once.
> >
> > That only shows something after long long staring :/ There's no words on
> > what the output actually means or what would've been expected.
> >
> > Also, your example is incomplete; the below is a test for scale_stime();
> > from this we can see that the division results in too large a number,
> > but, important for our use-case in cputime_adjust(), it is a step
> > function (due to loss in precision) and for every plateau we shift
> > runtime into the wrong bucket.
>
> But I'm still confused, since in the long run, it should still end up
> with a proportionally divided user/system, irrespective of some short
> term wobblies.
Why?
Yes, statistically the numbers are proportionally divided, but you will
(probably) never see the real stime == 1000 && utime == 10000 numbers if
you watch them incrementally.
Just in case... yes, I know that these numbers can only "converge" to
reality; only their sum is correct. But people complain.
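To make the wobble concrete, here is a rough userspace model, not the kernel
code itself. scale_stime_lossy() is transcribed from memory of the pre-patch
loop in kernel/sched/cputime.c, so treat it as an approximation. The inputs
(stime == utime == x, rtime == 2x, 1 second steps) are assumptions picked to
resemble the ./stime run quoted above, and the names scale_stime_lossy,
scale_stime_exact and errors are made up for this sketch.

```python
NSEC_PER_SEC = 10**9
U32 = 0xFFFFFFFF


def scale_stime_lossy(stime, rtime, total):
    """Approximate model of the old scale_stime(): computes
    stime * rtime / total with a 32x32 multiply and a 64/32 divide,
    dropping low bits until the operands fit."""
    while True:
        # Keep rtime as the larger of the two factors.
        if stime > rtime:
            stime, rtime = rtime, stime
        if total >> 32 == 0:
            # total fits in 32 bits; done once rtime (hence stime) fits too.
            if rtime >> 32 == 0:
                break
            if stime >> 31 == 0:
                # Rebalance the factors instead of dropping bits.
                stime <<= 1
                rtime >>= 1
                continue
        # Drop precision: halve rtime and total together.
        rtime >>= 1
        total >>= 1
    return ((stime & U32) * (rtime & U32)) // (total & U32)


def scale_stime_exact(stime, rtime, total):
    """Reference result using Python's arbitrary-precision integers."""
    return stime * rtime // total


def errors(start_sec, steps, step_ns=NSEC_PER_SEC):
    """Return lossy - exact (in ns) for each step.
    Assumed workload: stime == utime == x and rtime == 2x."""
    out = []
    x = start_sec * NSEC_PER_SEC
    for _ in range(steps):
        lossy = scale_stime_lossy(x, 2 * x, 2 * x)
        exact = scale_stime_exact(x, 2 * x, 2 * x)
        out.append(lossy - exact)
        x += step_ns
    return out


if __name__ == "__main__":
    for k, err in enumerate(errors(300000, 40)):
        print(f"t=+{k:2d}s  error={err / NSEC_PER_SEC:7.3f}s")
```

Under those assumptions the exact answer is always x, while the lossy result
starts out roughly 5 seconds too large at 300000 secs, ramps up further, and
snaps back about every 2^34 ns (roughly 17 secs). cputime_adjust() never lets
stime or utime go backwards, so each ramp and each snap shows up as a run
where only one of the two advances.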
Oleg.