lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 28 Jun 2010 11:03:11 -0700
From:	Patrick Pannuto <ppannuto@...eaurora.org>
To:	Daniel Walker <dwalker@...eaurora.org>
CC:	linux-kernel@...r.kernel.org, sboyd@...eaurora.org,
	tglx@...utronix.de, mingo@...e.hu, heiko.carstens@...ibm.com,
	eranian@...gle.com, schwidefsky@...ibm.com,
	linux-arm-msm@...r.kernel.org
Subject: Re: [RFC] [PATCH] timer: Added usleep[_range][_interruptable] timer

> 
> You could test residency in specific power states .. Since you want to
> test if your reducing power consumption .. However, I'd replace a ton
> more of these udelay()'s , I don't think you'll get a decent results
> with out that.
> 

Ok.

> 
> The busy waits your replacing are small in length on average .. If you
> have timers that trigger in small intervals then your not going to
> increase residency in any _lower_ power states .. It's possible that you
> could increase residency in the top level power state, but it seem like
> it would be really marginal .. You need to show that udelay()'s have an
> outside of noise impact on something ..
> 

Ok.

> 
> I think you need to do some more research on what your actually doing to
> the system. From what your showing us one could make a lot of different
> arguments as to what this change will actually do. You really need some
> sort of test that doesn't leave a lot of room for argument.
> 

Well here we go then.

First, I would note that IMO is both telling and remarkable that the
following (almost) worked and allowed the system to boot successfully:

(a little long since you can cut off half of it and just get a list of
affected / candidate files, the following is a glorified sed that
replaces all instances of udelay with usleep for udelay's >= 50us)

git grep 'udelay([[:digit:]]\+)' | 
	perl -F"[\(\)]" -anl -e 'print if $F[1] >= 50' | 
	cut -d: -f1 | 
	uniq | 
	grep -v staging | 
	xargs perl -i -wlpe 
		's/udelay/usleep/ if m/udelay\((\d+)\)/g && $1 >= 50'

The usb driver and one power routine that was in a critical section had
to be reverted; everything else ran successfully.


*** MEASUREMENT TECHNIQUE ***

/proc/msm_pm_stats will dump information on residency in power states as
the system is running; dumps were taken 15s, 30s, 60s, 120s, 180s, and
240s from boot. 10 trials were performed for each build. A trial was run
replacing udelay >1000, >500, >100, >=100, >=50.

	origin:		0 changes	(0 total)
	over1000:	106 changes	(106 total)
	over500:	193 changes	(299 total)
	over100:	302 changes	(601 total)
	equal100:	423 changes	(1024 total)
	equal50:	220 changes	(1244 total)

Now, a concern is obviously that these changes hit drivers that are not
used in this system, or that many will be hit only once in boot / 
initialization / etc; this shows plainly in the results and will be
addressed.


*** RESULTS ***

Complete results are attached (please forgive the ugly stats.py, it
evolved; it's functional for the task at hand, but certainly not pretty).

--> Long results also at end of email

>From the final results, after 240s:

TIME	STATE		#TIMES CALLED	 TIME IN STATE		| DELTA FROM ORG

////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) =================================== RUNTIME: 241.535 ===
240	idle-spin    --	2.3              1.45001e-05     
240	not-idle     --	88368.2          67.8311703321   
240	idle-request --	88367.2          23337.6322655   
240	idle-wfi     --	88364.9          172.296383696   
=== over1000 (10 samples) ====== (delta: -0.00999999998604) RUNTIME: 241.525 ===
240	idle-spin    --	3.1              1.91666e-05      	| 4.6665e-06
240	not-idle     --	88723.9          65.6361809172    	| -2.1949894149
240	idle-request --	88722.9          23311.9855603    	| -25.6467051707
240	idle-wfi     --	88719.8          174.493487111    	| 2.1971034149
=== over500 (10 samples) ======== (delta: -0.0339999999851) RUNTIME: 241.501 ===
240	idle-spin    --	2.3              1.88334e-05      	| 4.3333e-06
240	not-idle     --	88636.3          67.0242803241    	| -0.806890008
240	idle-request --	88635.3          23280.1632631    	| -57.469002442
240	idle-wfi     --	88633.0          173.077055869    	| 0.7806721723
=== over100 (10 samples) ======== (delta: -0.0539999999921) RUNTIME: 241.481 ===
240	idle-spin    --	0.9              6.6666e-06       	| -7.8335e-06
240	not-idle     --	88599.0          67.190273164     	| -0.6408971681
240	idle-request --	88597.9          23253.4797828    	| -84.1524827638
240	idle-wfi     --	88597.0          172.884529866    	| 0.5881461694
=== equal100 (10 samples) ================== (delta: -0.025) RUNTIME: 241.51 ===
240	idle-spin    --	1.4              9.5002e-06       	| -4.9999e-06
240	not-idle     --	88685.9          66.5067348407    	| -1.3244354914
240	idle-request --	88684.9          23294.4341497    	| -43.1981158192
240	idle-wfi     --	88683.5          173.60379269     	| 1.3074089936
=== equal50 (10 samples) ======== (delta: -0.0289999999804) RUNTIME: 241.506 ===
240	idle-spin    --	2.0              9.6664e-06       	| -4.8337e-06
240	not-idle     --	88537.4          65.8619214556    	| -1.9692488765
240	idle-request --	88536.4          22979.3665406    	| -358.265724952
240	idle-wfi     --	88534.4          174.247270576    	| 1.9508868794


There appears to be very little change outside of noise from origin through
equal100, however, once equal50 is reached, there is a noticeable change.
Further investigation revealed that before equal100, none of the changes
could be guaranteed to hit the testing target (which, if nothing else,
provides a nice baseline for noise).  The equal100 test included 3 changes
that could positively be sure to hit target and the equal50 test another
4 changes.  The equal50 test was the only test that seems to conclusively
show immediate benefit, but it *does* show benefit.

(here "positively sure" are arch-specific drivers that changed, easy to
search for; it's likely that more did hit the target along the way)


*** CONCLUSIONS ***

I submit that the concept of usleep introduces no harm and does have the
potential for benefit.  I further submit that as processors get faster, the
benefit from a usleep versus a udelay will only improve. Finally, I would note
that the success of the "sed" implies to me that many users of udelay would
prefer the semantics of a usleep call; it is simply unavailable and msleep is
simply too imprecise for callers to use, which is in and of itself a case
for a usleep API.

I invite comments, questions, etc, and am happy to run more tests; however,
I would like to request that if you for any reason believe this data to be
imprecise or insufficient that you please provide me with the *exact* test you
would like me to run and would like to see data from, since it does take a
fair amount of time to run these tests.

Finally, I would like to re-iterate my point that unnecessary busy-waiting,
while it *can* be pre-empted, *is* wasteful and __if it can easily be avoided,
it should be__.

-Pat


*** LONG RESULTS ***

==> Attached is pm_stats.tar.gz, it contains results as TEST_TRIALNO,
flash.sh to run tests (./flash.sh pm_stats TESTNAME) run from an Android
product directory, and stats.py to generate statistics. Forgive stats.py,
it is ugly and an abuse of python.

TIME	STATE		#TIMES CALLED	 TIME IN STATE		| DELTA FROM ORG

////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) ==================================== RUNTIME: 15.253 ===
15	idle-spin    --	1.1              4.8332e-06      
15	not-idle     --	59265.1          9.3842608833    
15	idle-request --	59262.2          21892.7762788   
15	idle-wfi     --	59261.1          4.4754141812    
=== over1000 (10 samples) ======== (delta: -0.00299999999115) RUNTIME: 15.25 ===
15	idle-spin    --	0.7              3.6666e-06       	| -1.1666e-06
15	not-idle     --	59243.9          9.3891611869     	| 0.0049003036
15	idle-request --	59237.1          21897.8631197    	| 5.0868408908
15	idle-wfi     --	59236.4          4.4665700454     	| -0.0088441358
=== over500 (10 samples) ======== (delta: -0.00899999999674) RUNTIME: 15.244 ===
15	idle-spin    --	0.5              2.8334e-06       	| -1.9998e-06
15	not-idle     --	59267.2          9.3819648583     	| -0.002296025
15	idle-request --	59263.2          21894.5342115    	| 1.75793271321
15	idle-wfi     --	59262.7          4.4662633725     	| -0.0091508087
=== over100 (10 samples) ======== (delta: -0.00200000000186) RUNTIME: 15.251 ===
15	idle-spin    --	0.1              1.1666e-06       	| -3.6666e-06
15	not-idle     --	59222.3          9.4040358354     	| 0.0197749521
15	idle-request --	59213.6          21893.1330305    	| 0.356751724303
15	idle-wfi     --	59213.5          4.4462678975     	| -0.0291462837
=== equal100 (10 samples) ======= (delta: -0.00599999999395) RUNTIME: 15.247 ===
15	idle-spin    --	0.1              5e-07            	| -4.3332e-06
15	not-idle     --	59311.6          9.392722999      	| 0.0084621157
15	idle-request --	59306.0          21900.290705     	| 7.5144261487
15	idle-wfi     --	59305.9          4.4606828991     	| -0.0147312821
=== equal50 (10 samples) ====================== (delta: 0.0) RUNTIME: 15.253 ===
15	idle-spin    --	0.6              1.8333e-06       	| -2.9999e-06
15	not-idle     --	59267.0          9.4182486803     	| 0.033987797
15	idle-request --	59261.4          21613.5706545    	| -279.205624353
15	idle-wfi     --	59260.8          4.4418992194     	| -0.0335149618
////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) ==================================== RUNTIME: 30.494 ===
30	idle-spin    --	1.2              5.1666e-06      
30	not-idle     --	69462.2          21.80567204     
30	idle-request --	69462.2          22568.1921778   
30	idle-wfi     --	69461.0          6.8173021886    
=== over1000 (10 samples) ======= (delta: -0.00100000001257) RUNTIME: 30.493 ===
30	idle-spin    --	1.1              6.3332e-06       	| 1.1666e-06
30	not-idle     --	69436.0          21.7969753263    	| -0.0086967137
30	idle-request --	69436.0          22564.1587592    	| -4.0334186002
30	idle-wfi     --	69434.9          6.8029330699     	| -0.0143691187
=== over500 (10 samples) ======== (delta: -0.00700000001816) RUNTIME: 30.487 ===
30	idle-spin    --	0.6              3.1667e-06       	| -1.9999e-06
30	not-idle     --	69474.6          21.7808043443    	| -0.0248676957
30	idle-request --	69474.6          22564.0631218    	| -4.1290559666
30	idle-wfi     --	69474.0          6.7958443852     	| -0.0214578034
=== over100 (10 samples) ========= (delta: 0.00099999998929) RUNTIME: 30.495 ===
30	idle-spin    --	0.2              1.6666e-06       	| -3.5e-06
30	not-idle     --	69379.1          21.8042313156    	| -0.0014407244
30	idle-request --	69379.1          22551.153217     	| -17.0389607525
30	idle-wfi     --	69378.9          6.7590274145     	| -0.0582747741
=== equal100 (10 samples) ======= (delta: -0.00300000001443) RUNTIME: 30.491 ===
30	idle-spin    --	0.2              1e-06            	| -4.1666e-06
30	not-idle     --	69455.7          21.8058526573    	| 0.0001806173
30	idle-request --	69455.7          22570.70418      	| 2.51200219339
30	idle-wfi     --	69455.5          6.7603640729     	| -0.0569381157
=== equal50 (10 samples) ======== (delta: -0.00100000002421) RUNTIME: 30.493 ===
30	idle-spin    --	0.9              3.8332e-06       	| -1.3334e-06
30	not-idle     --	69449.7          21.8319953111    	| 0.0263232711
30	idle-request --	69449.7          22271.8053346    	| -296.386843177
30	idle-wfi     --	69448.8          6.7478177546     	| -0.069484434
////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) ===================================== RUNTIME: 60.78 ===
60	idle-spin    --	1.9              1.16667e-05     
60	not-idle     --	79366.8          49.3036608463   
60	idle-request --	79366.3          22907.1705373   
60	idle-wfi     --	79364.4          9.0110567161    
=== over1000 (10 samples) ========= (delta: 0.0139999999898) RUNTIME: 60.794 ===
60	idle-spin    --	2.0              1.36668e-05      	| 2.0001e-06
60	not-idle     --	79493.9          49.8914426181    	| 0.5877817718
60	idle-request --	79491.4          22894.3020036    	| -12.8685336944
60	idle-wfi     --	79489.4          9.0156314459     	| 0.0045747298
=== over500 (10 samples) ========== (delta: -0.010999999987) RUNTIME: 60.769 ===
60	idle-spin    --	1.5              1.14999e-05      	| -1.668e-07
60	not-idle     --	79513.5          49.7980203079    	| 0.4943594616
60	idle-request --	79512.6          22869.995713     	| -37.1748243636
60	idle-wfi     --	79511.1          9.0027399223     	| -0.0083167938
=== over100 (10 samples) ========= (delta: -0.0229999999865) RUNTIME: 60.757 ===
60	idle-spin    --	0.8              6.3332e-06       	| -5.3335e-06
60	not-idle     --	79399.4          49.8599212971    	| 0.5562604508
60	idle-request --	79396.8          22849.5517755    	| -57.6187617949
60	idle-wfi     --	79396.0          8.9517647664     	| -0.0592919497
=== equal100 (10 samples) ======= (delta: -0.00699999999488) RUNTIME: 60.773 ===
60	idle-spin    --	1.0              6.1667e-06       	| -5.5e-06
60	not-idle     --	79508.7          49.7887594666    	| 0.4850986203
60	idle-request --	79506.3          22878.2886511    	| -28.8818862207
60	idle-wfi     --	79505.3          8.9810057637     	| -0.0300509524
=== equal50 (10 samples) ======== (delta: -0.00799999999581) RUNTIME: 60.772 ===
60	idle-spin    --	1.4              5.9998e-06       	| -5.6669e-06
60	not-idle     --	79401.8          49.7245356251    	| 0.4208747788
60	idle-request --	79401.8          22589.3334706    	| -317.837066708
60	idle-wfi     --	79400.4          8.9224572725     	| -0.0885994436
////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) =================================== RUNTIME: 121.018 ===
120	idle-spin    --	2.3              1.45001e-05     
120	not-idle     --	84534.3          65.4829273526   
120	idle-request --	84533.2          23073.5171825   
120	idle-wfi     --	84530.9          54.1388846978   
=== over1000 (10 samples) ======== (delta: 0.0149999999907) RUNTIME: 121.033 ===
120	idle-spin    --	2.9              1.76667e-05      	| 3.1666e-06
120	not-idle     --	84899.7          62.9539904388    	| -2.5289369138
120	idle-request --	84898.6          23058.6542567    	| -14.8629258454
120	idle-wfi     --	84895.7          56.6840665995    	| 2.5451819017
=== over500 (10 samples) ======= (delta: -0.00999999999767) RUNTIME: 121.008 ===
120	idle-spin    --	2.1              1.61667e-05      	| 1.6666e-06
120	not-idle     --	84810.4          64.5704918038    	| -0.9124355488
120	idle-request --	84809.3          23020.7240229    	| -52.7931596084
120	idle-wfi     --	84807.2          55.0429809068    	| 0.904096209
=== over100 (10 samples) ======== (delta: -0.0209999999963) RUNTIME: 120.997 ===
120	idle-spin    --	0.9              6.6666e-06       	| -7.8335e-06
120	not-idle     --	84773.7          64.6287104755    	| -0.8542168771
120	idle-request --	84772.7          23002.4330668    	| -71.0841157664
120	idle-wfi     --	84771.8          54.9635557389    	| 0.8246710411
=== equal100 (10 samples) ======= (delta: 0.00600000000559) RUNTIME: 121.024 ===
120	idle-spin    --	1.4              9.5002e-06       	| -4.9999e-06
120	not-idle     --	84843.8          63.8751059888    	| -1.6078213638
120	idle-request --	84842.8          23042.7777274    	| -30.7394551461
120	idle-wfi     --	84841.4          55.7466510555    	| 1.6077663577
=== equal50 (10 samples) ======= (delta: -0.00099999998929) RUNTIME: 121.017 ===
120	idle-spin    --	2.0              9.6664e-06       	| -4.8337e-06
120	not-idle     --	84767.4          63.0424664541    	| -2.4404608985
120	idle-request --	84766.4          22740.2895937    	| -333.227588835
120	idle-wfi     --	84764.4          56.5740535877    	| 2.4351688899
////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) =================================== RUNTIME: 181.273 ===
180	idle-spin    --	2.3              1.45001e-05     
180	not-idle     --	86458.1          66.7356990125   
180	idle-request --	86457.1          23202.3433604   
180	idle-wfi     --	86454.8          113.126771849   
=== over1000 (10 samples) ========= (delta: 0.010999999987) RUNTIME: 181.284 ===
180	idle-spin    --	3.0              1.83333e-05      	| 3.8332e-06
180	not-idle     --	86831.3          64.5482361081    	| -2.1874629044
180	idle-request --	86830.3          23181.3150713    	| -21.028289176
180	idle-wfi     --	86827.3          115.331773087    	| 2.2050012376
=== over500 (10 samples) ======== (delta: -0.0250000000117) RUNTIME: 181.248 ===
180	idle-spin    --	2.2              1.68334e-05      	| 2.3333e-06
180	not-idle     --	86733.7          65.9390319914    	| -0.7966670211
180	idle-request --	86732.7          23151.5750952    	| -50.7682652808
180	idle-wfi     --	86730.5          113.915960535    	| 0.7891886856
=== over100 (10 samples) ======== (delta: -0.0339999999968) RUNTIME: 181.239 ===
180	idle-spin    --	0.9              6.6666e-06       	| -7.8335e-06
180	not-idle     --	86696.8          66.0923128287    	| -0.6433861838
180	idle-request --	86695.7          23122.6406168    	| -79.7027435974
180	idle-wfi     --	86694.8          113.744699867    	| 0.6179280181
=== equal100 (10 samples) ======= (delta: -0.0060000000056) RUNTIME: 181.267 ===
180	idle-spin    --	1.4              9.5002e-06       	| -4.9999e-06
180	not-idle     --	86765.3          65.4099056755    	| -1.325793337
180	idle-request --	86764.2          23156.7172921    	| -45.6260683097
180	idle-wfi     --	86762.8          114.457775355    	| 1.3310035056
=== equal50 (10 samples) ======= (delta: -0.00799999998418) RUNTIME: 181.265 ===
180	idle-spin    --	2.0              9.6664e-06       	| -4.8337e-06
180	not-idle     --	86665.1          64.770143973     	| -1.9655550395
180	idle-request --	86664.1          22851.7003915    	| -350.642968977
180	idle-wfi     --	86662.1          115.098642558    	| 1.971870709
////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) =================================== RUNTIME: 241.535 ===
240	idle-spin    --	2.3              1.45001e-05     
240	not-idle     --	88368.2          67.8311703321   
240	idle-request --	88367.2          23337.6322655   
240	idle-wfi     --	88364.9          172.296383696   
=== over1000 (10 samples) ====== (delta: -0.00999999998604) RUNTIME: 241.525 ===
240	idle-spin    --	3.1              1.91666e-05      	| 4.6665e-06
240	not-idle     --	88723.9          65.6361809172    	| -2.1949894149
240	idle-request --	88722.9          23311.9855603    	| -25.6467051707
240	idle-wfi     --	88719.8          174.493487111    	| 2.1971034149
=== over500 (10 samples) ======== (delta: -0.0339999999851) RUNTIME: 241.501 ===
240	idle-spin    --	2.3              1.88334e-05      	| 4.3333e-06
240	not-idle     --	88636.3          67.0242803241    	| -0.806890008
240	idle-request --	88635.3          23280.1632631    	| -57.469002442
240	idle-wfi     --	88633.0          173.077055869    	| 0.7806721723
=== over100 (10 samples) ======== (delta: -0.0539999999921) RUNTIME: 241.481 ===
240	idle-spin    --	0.9              6.6666e-06       	| -7.8335e-06
240	not-idle     --	88599.0          67.190273164     	| -0.6408971681
240	idle-request --	88597.9          23253.4797828    	| -84.1524827638
240	idle-wfi     --	88597.0          172.884529866    	| 0.5881461694
=== equal100 (10 samples) ================== (delta: -0.025) RUNTIME: 241.51 ===
240	idle-spin    --	1.4              9.5002e-06       	| -4.9999e-06
240	not-idle     --	88685.9          66.5067348407    	| -1.3244354914
240	idle-request --	88684.9          23294.4341497    	| -43.1981158192
240	idle-wfi     --	88683.5          173.60379269     	| 1.3074089936
=== equal50 (10 samples) ======== (delta: -0.0289999999804) RUNTIME: 241.506 ===
240	idle-spin    --	2.0              9.6664e-06       	| -4.8337e-06
240	not-idle     --	88537.4          65.8619214556    	| -1.9692488765
240	idle-request --	88536.4          22979.3665406    	| -358.265724952
240	idle-wfi     --	88534.4          174.247270576    	| 1.9508868794

Download attachment "pm_stats.tar.gz" of type "application/x-gzip" (119862 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ