Date: Fri, 1 Nov 2013 10:13:37 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Neil Horman <nhorman@...driver.com>
Cc: Eric Dumazet <eric.dumazet@...il.com>, linux-kernel@...r.kernel.org,
    sebastien.dugue@...l.net, Thomas Gleixner <tglx@...utronix.de>,
    Ingo Molnar <mingo@...hat.com>, "H. Peter Anvin" <hpa@...or.com>,
    x86@...nel.org, netdev@...r.kernel.org
Subject: Re: [PATCH] x86: Run checksumming in parallel accross multiple alu's

* Neil Horman <nhorman@...driver.com> wrote:

> On Thu, Oct 31, 2013 at 11:22:00AM +0100, Ingo Molnar wrote:
> >
> > * Neil Horman <nhorman@...driver.com> wrote:
> >
> > > > etc. For such short runtimes make sure the last column displays
> > > > close to 100%, so that the PMU results become trustable.
> > > >
> > > > A nehalem+ PMU will allow 2-4 events to be measured in parallel,
> > > > plus generics like 'cycles', 'instructions' can be added 'for free'
> > > > because they get counted in a separate (fixed purpose) PMU register.
> > > >
> > > > The last column tells you what percentage of the runtime that
> > > > particular event was actually active. 100% (or empty last column)
> > > > means it was active all the time.
> > > >
> > > > Thanks,
> > > >
> > > > Ingo
> > > >
> > >
> > > Hmm,
> > >
> > > I ran this test:
> > >
> > > for i in `seq 0 1 3`
> > > do
> > > echo $i > /sys/module/csum_test/parameters/module_test_mode
> > > taskset -c 0 perf stat --repeat 20 -C 0 -e L1-dcache-load-misses -e L1-dcache-prefetches -e cycles -e instructions -ddd ./test.sh
> > > done
> >
> > You need to remove '-ddd' which is a shortcut for a ton of useful
> > events, but here you want to use fewer events, to increase the
> > precision of the measurement.
> >
> > Thanks,
> >
> > Ingo
> >
>
> Thank you ingo, that fixed it. I'm trying some other variants of
> the csum algorithm that Doug and I discussed last night, but FWIW,
> the relative performance of the 4 test cases
> (base/prefetch/parallel/both) remains unchanged. I'm starting to
> feel like at this point, there's very little point in doing
> parallel alu operations (unless we can find a way to break the
> dependency on the carry flag, which is what I'm tinkering with
> now).

I would still like to encourage you to pick up the improvements that
Doug measured (mostly via prefetch tweaking?) - that looked like
some significant speedups that we don't want to lose!

Also, trying to stick the in-kernel implementation into 'perf bench'
would be a useful first step as well, for this and future efforts.

See what we do in tools/perf/bench/mem-memcpy-x86-64-asm.S to pick
up the in-kernel assembly memcpy implementations:

    #define memcpy MEMCPY /* don't hide glibc's memcpy() */
    #define altinstr_replacement text
    #define globl p2align 4; .globl
    #define Lmemcpy_c globl memcpy_c; memcpy_c
    #define Lmemcpy_c_e globl memcpy_c_e; memcpy_c_e
    #include "../../../arch/x86/lib/memcpy_64.S"

So it needed a bit of trickery/wrappery for 'perf bench mem memcpy',
but that is a one-time effort - once it's done then the current
in-kernel csum_partial() implementation would be easily measurable
(and any performance regression in it bisectable, etc.) from that
point on.

In user-space it would also be easier to add various parameters and
experimental implementations and background cache-stressing
workloads automatically.

Something similar might be possible for csum_partial(),
csum_partial_copy*(), etc.

Note, if any of you ventures to add checksum-benchmarking to perf
bench, please base any patches on top of tip:perf/core:

    git pull git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/core

as there are a couple of perf bench enhancements in the pipeline
already for v3.13.

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
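
As a rough illustration of the user-space benchmarking idea discussed above, here is a minimal C sketch. It is not the kernel's csum_partial() and not part of 'perf bench'. The names csum_single(), csum_dual() and time_csum() are hypothetical and exist only for this example. The sketch also assumes one particular way of sidestepping the carry-flag chain, which this thread does not confirm as the approach being tried: 32-bit words are summed into 64-bit accumulators, so carries collect in the upper half and are folded back once at the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Fold a 64-bit accumulator down to a 16-bit ones'-complement sum. */
static uint16_t fold64(uint64_t sum)
{
    sum = (sum & 0xffffffffu) + (sum >> 32);
    sum = (sum & 0xffffffffu) + (sum >> 32);
    sum = (sum & 0xffffu) + (sum >> 16);
    sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Hypothetical baseline: a single accumulator, one add per 32-bit word.
 * len must be a multiple of 4; the 64-bit sum cannot overflow for any
 * realistic buffer size. */
uint16_t csum_single(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint64_t sum = 0;
    uint32_t w;

    assert(len % 4 == 0);
    for (size_t i = 0; i < len; i += 4) {
        memcpy(&w, p + i, 4);      /* alignment-safe load */
        sum += w;
    }
    return fold64(sum);
}

/* Hypothetical variant: two independent accumulators, so consecutive
 * adds do not depend on each other (or on a carry flag). */
uint16_t csum_dual(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint64_t a = 0, b = 0;
    uint32_t w0, w1;
    size_t i = 0;

    assert(len % 4 == 0);
    for (; i + 8 <= len; i += 8) {
        memcpy(&w0, p + i, 4);
        memcpy(&w1, p + i + 4, 4);
        a += w0;
        b += w1;
    }
    if (i < len) {                 /* one trailing 32-bit word */
        memcpy(&w0, p + i, 4);
        a += w0;
    }
    return fold64(a + b);
}

/* Time 'reps' calls of fn and return elapsed CPU seconds. The first
 * byte of buf is perturbed on every iteration so the compiler cannot
 * compute the checksum once and reuse it. */
double time_csum(uint16_t (*fn)(const void *, size_t),
                 unsigned char *buf, size_t len, int reps)
{
    volatile uint16_t sink = 0;
    clock_t t0 = clock();

    for (int i = 0; i < reps; i++) {
        buf[0] = (unsigned char)i;
        sink = fn(buf, len);
    }
    (void)sink;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

A caller would fill a buffer, check that csum_single() and csum_dual() agree on it, and then compare the times returned by time_csum() for each. For PMU counters rather than wall time, the resulting binary could be run under 'perf stat' with a small event list, as described earlier in the thread.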