Message-ID: <IA1PR11MB61710CDB2B6B47118832770E89B79@IA1PR11MB6171.namprd11.prod.outlook.com>
Date:   Tue, 7 Mar 2023 07:49:49 +0000
From:   "Zhuo, Qiuxu" <qiuxu.zhuo@...el.com>
To:     "paulmck@...nel.org" <paulmck@...nel.org>
CC:     "Joel Fernandes (Google)" <joel@...lfernandes.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Frederic Weisbecker" <frederic@...nel.org>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "rcu@...r.kernel.org" <rcu@...r.kernel.org>,
        "urezki@...il.com" <urezki@...il.com>
Subject: RE: [PATCH v3] rcu: Add a minimum time for marking boot as completed

> From: Paul E. McKenney <paulmck@...nel.org>
> [...]
> >
> > Thank you so much Paul for the detailed comments on the measured data.
> >
> > I'm curious how did you figure out the number 24 that we at *least* need.
> > This can guide me on whether the number of samples is enough for
> > future testing ;-).
> 
> It is a rough rule of thumb.  For more details and accuracy, study up on the
> Student's t-test and related statistical tests.
> 
> Of course, this all assumes that the data fits a normal distribution.

Thanks for this extra information. Good to know about the Student's t-test.

> > I did another 48 measurements (2x of 24) for each case (w/o and w/
> > Joel's v2 patch) as below.
> > All the testing configurations for the new testing are the same as
> > before.
> >
> > a) Measured 48 times w/o v2 patch (in seconds):
> >     8.4, 8.8, 9.2, 9.0, 8.3, 9.6, 8.8, 9.4,
> >     8.7, 9.2, 8.3, 9.4, 8.4, 9.6, 8.5, 8.8,
> >     8.8, 8.9, 9.3, 9.2, 8.6, 9.7, 9.2, 8.8,
> >     8.7, 9.0, 9.1, 9.5, 8.6, 8.9, 9.1, 8.6,
> >     8.2, 9.1, 8.8, 9.2, 9.1, 8.9, 8.4, 9.0,
> >     9.8, 9.8, 8.7, 8.8, 9.1, 9.5, 9.5, 8.7
> >     The average OS boot time was: ~9.0s
> 
> The range is 8.2 through 9.8.
> 
> > b) Measured 48 times w/ v2 patch (in seconds):
> >     7.7, 8.6, 8.1, 7.8, 8.2, 8.2, 8.8, 8.2,
> >     9.8, 8.0, 9.2, 8.8, 9.2, 8.5, 8.4, 9.2,
> >     8.5, 8.3, 8.1, 8.3, 8.6, 7.9, 8.3, 8.3,
> >     8.6, 8.9, 8.0, 8.5, 8.4, 8.6, 8.7, 8.0,
> >     8.8, 8.8, 9.1, 7.9, 9.7, 7.9, 8.2, 7.8,
> >     8.1, 8.5, 8.6, 8.4, 9.2, 8.6, 9.6, 8.3,
> >     The average OS boot time was: ~8.5s
> 
> The range is 7.7 through 9.8.
> 
> There is again significant overlap, so it is again unclear that you have a
> statistically significant difference.  So could you please calculate the standard
> deviations?

a's standard deviation is ~0.4.
b's standard deviation is ~0.5.

a's average of 9.0 is at the upper bound of b's one-standard-deviation
interval [8.0, 9.0] (b's average 8.5 +/- 0.5).
So, the difference should be statistically significant to some degree.
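
As a rough cross-check of the significance question, here is a minimal sketch of Welch's unequal-variance t-test computed from the rounded summary numbers quoted in this thread (averages ~9.0s and ~8.5s, standard deviations ~0.4 and ~0.5, 48 samples each). The helper name welch_t is made up for illustration, the inputs are rounded so the result is only approximate, and the test assumes the data is roughly normally distributed, as Paul noted.

```python
import math

# Summary statistics as quoted in this thread (rounded, so results are approximate).
n_a, mean_a, sd_a = 48, 9.0, 0.4   # a) without the v2 patch
n_b, mean_b, sd_b = 48, 8.5, 0.5   # b) with the v2 patch


def welch_t(m1, s1, n1, m2, s2, n2):
    """Return (t, df) for Welch's unequal-variance t-test from summary stats.

    Hypothetical helper for illustration; df uses the Welch-Satterthwaite
    approximation.
    """
    v1 = s1 ** 2 / n1          # squared standard error of mean 1
    v2 = s2 ** 2 / n2          # squared standard error of mean 2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


t_stat, dof = welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A |t| well above roughly 2 (about the two-sided 5% critical value for this many degrees of freedom) would suggest that the difference between the averages is unlikely to be noise, subject to the normality assumption.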

The standard deviations were calculated via:
https://www.gigacalculator.com/calculators/standard-deviation-calculator.php
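
For anyone who wants to reproduce the averages and standard deviations without the online calculator, here is a small sketch using only Python's standard library. It uses the sample standard deviation (n-1 divisor); whether the calculator above uses the sample or the population form is not known here, but with 48 samples both round to the same single digit.

```python
import statistics

# a) 48 boot times without the v2 patch (seconds), copied from the data above.
a = [8.4, 8.8, 9.2, 9.0, 8.3, 9.6, 8.8, 9.4,
     8.7, 9.2, 8.3, 9.4, 8.4, 9.6, 8.5, 8.8,
     8.8, 8.9, 9.3, 9.2, 8.6, 9.7, 9.2, 8.8,
     8.7, 9.0, 9.1, 9.5, 8.6, 8.9, 9.1, 8.6,
     8.2, 9.1, 8.8, 9.2, 9.1, 8.9, 8.4, 9.0,
     9.8, 9.8, 8.7, 8.8, 9.1, 9.5, 9.5, 8.7]

# b) 48 boot times with the v2 patch (seconds), copied from the data above.
b = [7.7, 8.6, 8.1, 7.8, 8.2, 8.2, 8.8, 8.2,
     9.8, 8.0, 9.2, 8.8, 9.2, 8.5, 8.4, 9.2,
     8.5, 8.3, 8.1, 8.3, 8.6, 7.9, 8.3, 8.3,
     8.6, 8.9, 8.0, 8.5, 8.4, 8.6, 8.7, 8.0,
     8.8, 8.8, 9.1, 7.9, 9.7, 7.9, 8.2, 7.8,
     8.1, 8.5, 8.6, 8.4, 9.2, 8.6, 9.6, 8.3]

for name, data in (("a", a), ("b", b)):
    mean = statistics.mean(data)
    sd = statistics.stdev(data)   # sample standard deviation (n-1 divisor)
    print(f"{name}: n={len(data)} mean={mean:.2f} sd={sd:.2f} "
          f"min={min(data)} max={max(data)}")
```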

> > @Joel Fernandes (Google), you may replace my old data with the above
> > new data in your commit message.
> >
> > > But we can apply the binomial distribution instead of the usual
> > > normal distribution.  First, let's sort and take the medians:
> > >
> > > a: 8.2 8.3 8.4 8.6 8.7 8.7 8.8 8.8 9.0 9.3  Median: 8.7
> > > b: 7.6 7.8 8.2 8.2 8.2 8.2 8.4 8.5 8.7 9.3  Median: 8.2
> > >
> > > 8/10 of a's data points are greater than 0.1 more than b's median
> > > and 8/10 of b's data points are less than 0.1 less than a's median.
> > > What are the odds that this happens by random chance?
> > >
> > > This is given by sum_0^2 (0.5^10 * binomial(10,i)), which is about 0.055.
> >
> > What's the meaning of 0.5 here? Was it the probability (we assume?)
> > that each time b's data point failed (or didn't satisfy) "less than
> > 0.1 less than a's median"?
> 
> The meaning of 0.5 is the probability of a given data point being on one side
> or the other of the corresponding distribution's median.  This of course
> assumes that the median of the measured data matches that of the
> corresponding distribution, though the fact that the median is also a mode of
> both of the old data sets gives some hope.

  Thanks for the detailed comments on the meaning of 0.5 here. :-)
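
  For reference, the sum quoted above, sum_0^2 (0.5^10 * binomial(10,i)), can be checked with a few lines of Python. This is only a sketch; the function name binom_cdf is made up for illustration, and p = 0.5 is the "either side of the median" probability Paul describes.

```python
from math import comb


def binom_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p). Hypothetical helper for illustration."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


# At most 2 of 10 points land on the "wrong" side: (1 + 10 + 45) / 2^10.
p_value = binom_cdf(2, 10)
print(f"{p_value:.4f}")  # 56/1024, i.e. about 0.055
```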

> The meaning of the 0.1 is the smallest difference that the data could measure.
> I could have instead chosen 0.0 and asked if there was likely some (perhaps
> tiny) difference, but instead, I chose to ask if there was likely some small but
> meaningful difference.  It is better to choose the desired difference before
> measuring the data.

  Thanks for the detailed comments on the meaning of 0.1 here. :-)

> Why don't you try applying this approach to the new data?  You will need the
> general binomial formula.

   Thank you, Paul, for the suggestion.
   I just tried it, but I'm not sure whether my analysis is correct ...

   Analysis 1:
   a's median is 8.9.
   35/48 of b's data points are less than 0.1 less than a's median.
   For a binomial distribution with n=48 and p=0.5, P(X >= 35) = ~0.1%.
   So, we have strong confidence that b is 100ms faster than a.

   Analysis 2:
   a's median - 0.4 = 8.9 - 0.4 = 8.5. 
   24/48 of b's data points are less than 0.4 less than a's median.
   The probability that a's data points are less than 8.5 is p = 7/48 = 0.1458
   For a binomial distribution with n=48 and p=0.1458, P(X >= 24) = ~0.0%.
   So, it looks like we have confidence that b is 400ms faster than a.

   The cumulative binomial probabilities P(X) were calculated via:
   https://www.gigacalculator.com/calculators/binomial-probability-calculator.php
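
   The two analyses above can also be reproduced offline. The sketch below counts b's data points under each threshold and evaluates the upper binomial tail. Assumptions: a's median of 8.9 and p = 7/48 are taken from this message as-is rather than recomputed, the helper names binom_sf and count_below are made up for illustration, and values are compared in tenths of a second to avoid floating-point surprises.

```python
from math import comb

# b) 48 boot times with the v2 patch (seconds), copied from the data above.
b = [7.7, 8.6, 8.1, 7.8, 8.2, 8.2, 8.8, 8.2,
     9.8, 8.0, 9.2, 8.8, 9.2, 8.5, 8.4, 9.2,
     8.5, 8.3, 8.1, 8.3, 8.6, 7.9, 8.3, 8.3,
     8.6, 8.9, 8.0, 8.5, 8.4, 8.6, 8.7, 8.0,
     8.8, 8.8, 9.1, 7.9, 9.7, 7.9, 8.2, 7.8,
     8.1, 8.5, 8.6, 8.4, 9.2, 8.6, 9.6, 8.3]


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p). Hypothetical helper for illustration."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def count_below(data, threshold_tenths):
    """Count points strictly below a threshold given in tenths of a second."""
    return sum(1 for x in data if round(x * 10) < threshold_tenths)


MEDIAN_A_TENTHS = 89   # a's median of 8.9 s, as stated in this message
n = len(b)

# Analysis 1: b's points more than 0.1 s below a's median, with p = 0.5.
k1 = count_below(b, MEDIAN_A_TENTHS - 1)
p1 = binom_sf(k1, n, 0.5)

# Analysis 2: b's points more than 0.4 s below a's median, with p = 7/48
# (the fraction of a's points quoted in this message).
k2 = count_below(b, MEDIAN_A_TENTHS - 4)
p2 = binom_sf(k2, n, 7 / 48)

print(f"analysis 1: {k1}/{n} below 8.8, P(X >= {k1}) = {p1:.4%}")
print(f"analysis 2: {k2}/{n} below 8.5, P(X >= {k2}) = {p2:.2e}")
```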

   I apologize if this analysis/discussion bored some of you. ;-)

-Qiuxu

> [...]
