Date: Tue, 3 May 2011 17:58:54 -0700
From: Nikhil Rao <ncrao@...gle.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: Peter Zijlstra <peterz@...radead.org>, Mike Galbraith <efault@....de>,
	linux-kernel@...r.kernel.org,
	"Nikunj A. Dadhania" <nikunj@...ux.vnet.ibm.com>,
	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
	Stephan Barwolf <stephan.baerwolf@...ilmenau.de>
Subject: Re: [PATCH v1 00/19] Increase resolution of load weights

On Sun, May 1, 2011 at 11:14 PM, Ingo Molnar <mingo@...e.hu> wrote:
>
> * Nikhil Rao <ncrao@...gle.com> wrote:
>
>> 1. Performance costs
>>
>> Ran 50 iterations of Ingo's pipe-test-100k program (100k pipe ping-pongs).
>> See http://thread.gmane.org/gmane.linux.kernel/1129232/focus=1129389 for
>> more info.
>>
>> 64-bit build.
>>
>> 2.6.39-rc5 (baseline):
>>
>>  Performance counter stats for './pipe-test-100k' (50 runs):
>>
>>        905,034,914 instructions           #  0.345 IPC   ( +- 0.016% )
>>      2,623,924,516 cycles                                ( +- 0.759% )
>>
>>        1.518543478 seconds time elapsed                  ( +- 0.513% )
>>
>> 2.6.39-rc5 + patchset:
>>
>>  Performance counter stats for './pipe-test-100k' (50 runs):
>>
>>        905,351,545 instructions           #  0.343 IPC   ( +- 0.018% )
>>      2,638,939,777 cycles                                ( +- 0.761% )
>>
>>        1.509101452 seconds time elapsed                  ( +- 0.537% )
>>
>> There is a marginal increase in instructions retired, about 0.034%, and a
>> marginal increase in cycles counted, about 0.57%.
>
> Not sure this increase is statistically significant: both effects are
> within noise, and look at elapsed time: it actually went down.
>
> Btw., to best measure context-switching costs you should do something like:
>
>   taskset 1 perf stat --repeat 50 ./pipe-test-100k
>
> to pin both tasks to the same CPU. This reduces noise and makes the numbers
> more relevant: SMP costs do not increase due to your patchset.
>
> So it would be nice to re-run the 64-bit tests with the pipe test bound to
> a single CPU.

I re-ran the 64-bit tests with the pipe test bound to a single CPU. Data
attached below.

2.6.39-rc5:

 Performance counter stats for './pipe-test-100k' (100 runs):

       855,571,900 instructions           #  0.869 IPC   ( +- 0.637% )
       984,213,635 cycles                                ( +- 0.254% )

       0.796129773 seconds time elapsed                  ( +- 0.152% )

2.6.39-rc5 + patchset:

 Performance counter stats for './pipe-test-100k' (100 runs):

       905,553,828 instructions           #  0.934 IPC   ( +- 0.059% )
       969,792,787 cycles                                ( +- 0.168% )

       0.788676004 seconds time elapsed                  ( +- 0.122% )

There is a 5.8% increase in instructions, which is statistically significant
and well over the error margins. Cycles dropped by about 1.17%, and elapsed
time also dropped by about 1%. I'm looking into profiles for this test to
understand why the instruction count has increased.

>
>> 32-bit build.
>>
>> 2.6.39-rc5 (baseline):
>>
>>  Performance counter stats for './pipe-test-100k' (50 runs):
>>
>>     1,025,151,722 instructions           #  0.238 IPC   ( +- 0.018% )
>>     4,303,226,625 cycles                                ( +- 0.524% )
>>
>>       2.133056844 seconds time elapsed                  ( +- 0.619% )
>>
>> 2.6.39-rc5 + patchset:
>>
>>  Performance counter stats for './pipe-test-100k' (50 runs):
>>
>>     1,070,610,068 instructions           #  0.239 IPC   ( +- 1.369% )
>>     4,478,912,974 cycles                                ( +- 1.011% )
>>
>>       2.293382242 seconds time elapsed                  ( +- 0.144% )
>>
>> On 32-bit kernels, instructions retired increase by about 4.4% with this
>> patchset. CPU cycles also increase by about 4%.
>
> These results look more bothersome: a clear increase in cycles, elapsed
> time, and instructions retired, well beyond measurement noise.
>
> Given that scheduling costs are roughly 30% of that pipe test-case, the
> cost increase to the scheduler is probably around:
>
>   instructions: +14.5%
>   cycles:       +13.3%
>
> That is rather significant.

I'll take a closer look at the performance of this patchset this week. I'm a
little confused about how you calculated the cost to the scheduler; how did
you come up with 14.5% and 13.3%? Also, out of curiosity, what's an
acceptable tolerance level for a performance hit on 32-bit?

-Thanks
Nikhil

> Thanks,
>
> 	Ingo
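[A plausible reconstruction of Ingo's arithmetic, for reference: if roughly
30% of the test's work is scheduler code, and the entire measured increase
is attributed to that share, the scheduler-only increase is the overall
increase divided by 0.30. That attribution assumption is not confirmed in
the thread; this is only a sketch:

#include <stdio.h>

int main(void)
{
	const double sched_share = 0.30;  /* Ingo's ~30% scheduler share  */
	const double instr_delta = 0.044; /* +4.4% instructions (32-bit)  */
	const double cycle_delta = 0.040; /* +4.0% cycles (32-bit)        */

	/* Overall increase divided by the scheduler's share of the work. */
	printf("scheduler instructions: +%.1f%%\n",
	       100.0 * instr_delta / sched_share);
	printf("scheduler cycles:       +%.1f%%\n",
	       100.0 * cycle_delta / sched_share);
	return 0;
}

This prints +14.7% and +13.3%, matching the quoted cycles figure exactly and
coming close to the quoted instructions figure.]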
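[For readers without the gmane link handy: the workload is a two-task pipe
ping-pong, where each of the 100k iterations forces a pair of context
switches, which is why scheduler cost dominates the profile. A minimal,
purely illustrative sketch of that shape follows; the actual pipe-test-100k
source is behind the link quoted above:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

#define LOOPS 100000

int main(void)
{
	int ping[2], pong[2];
	char c = 0;
	int i;

	if (pipe(ping) || pipe(pong)) {
		perror("pipe");
		exit(1);
	}

	if (fork() == 0) {
		/* Child: wait for a byte on 'ping', echo it on 'pong'. */
		for (i = 0; i < LOOPS; i++) {
			if (read(ping[0], &c, 1) != 1 ||
			    write(pong[1], &c, 1) != 1)
				exit(1);
		}
		exit(0);
	}

	/*
	 * Parent: send a byte, wait for the echo. With both tasks pinned
	 * to one CPU (taskset 1), every round trip costs two context
	 * switches.
	 */
	for (i = 0; i < LOOPS; i++) {
		if (write(ping[1], &c, 1) != 1 ||
		    read(pong[0], &c, 1) != 1)
			exit(1);
	}
	wait(NULL);
	return 0;
}

Run as "taskset 1 perf stat --repeat 50 ./pipe-test-100k" to reproduce the
kind of counter output shown above.]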