Date:	Tue, 30 Oct 2007 15:12:26 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Christoph Lameter <clameter@....com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>, matthew@....cx,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	penberg@...helsinki.fi, linux-arch@...r.kernel.org
Subject: Re: [patch 08/10] SLUB: Optional fast path using cmpxchg_local

* Christoph Lameter (clameter@....com) wrote:
> On Tue, 30 Oct 2007, Andrew Morton wrote:
> 
> > Let's cc linux-arch: presumably other architectures can implement cpu-local
> > cmpxchg and would see some benefit from doing so.
> 
> Mathieu had a whole series of cmpxchg_local patches. Ccing him too. I 
> think he has some numbers for other architectures.
>  

Well, I tested it on x86 and AMD64 only. For slub:

Using cmpxchg_local speeds up the fast path; the improvement ranges from
a 66% speedup on a Pentium 4 to a 14% speedup on AMD64.

It really depends on how fast cmpxchg_local is vs disabling interrupts.
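
As a rough illustration of the kind of fast path being compared here (a
user-space sketch, not the actual SLUB code), a per-CPU freelist can be
popped and pushed with a compare-and-exchange retry loop instead of
bracketing the update with interrupt disable/enable. The names
sketch_cas_local, sketch_alloc_fast, sketch_free_fast, struct object and
struct cpu_slab are hypothetical, and sketch_cas_local is a plain C
stand-in for cmpxchg_local, valid only because nothing can interrupt this
single-threaded example.

```c
#include <stddef.h>

/* Hypothetical object and per-CPU slab layout, for illustration only. */
struct object {
    struct object *next;
};

struct cpu_slab {
    struct object *freelist;
};

/*
 * Stand-in for cmpxchg_local (assumption: single-threaded user space, so a
 * plain compare-and-store is enough). Returns the value found at *ptr.
 */
static struct object *sketch_cas_local(struct object **ptr,
                                       struct object *old,
                                       struct object *new_val)
{
    struct object *prev = *ptr;

    if (prev == old)
        *ptr = new_val;
    return prev;
}

/* Pop the first free object; retry if the freelist changed under us. */
static struct object *sketch_alloc_fast(struct cpu_slab *c)
{
    struct object *obj;

    do {
        obj = c->freelist;
        if (!obj)
            return NULL; /* a real allocator would take its slow path here */
    } while (sketch_cas_local(&c->freelist, obj, obj->next) != obj);

    return obj;
}

/* Push an object back onto the freelist with the same retry pattern. */
static void sketch_free_fast(struct cpu_slab *c, struct object *obj)
{
    struct object *head;

    do {
        head = c->freelist;
        obj->next = head;
    } while (sketch_cas_local(&c->freelist, head, obj) != head);
}
```

Whether this beats the interrupt disable/enable variant depends entirely
on the relative cost of the two primitives on the target CPU, which is
the point made above.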


> > The semantics are "atomic wrt interrupts on this cpu, not atomic wrt other
> > cpus", yes?
> 
> Right.
> 
> > Do you have a feel for how useful it would be for arch maintainers to implement
> > this?  IOW, is it worth their time?
> 
> That depends on the efficiency of a cmpxchg_local vs. the interrupt 
> enable/ disable sequence on a particular arch. On x86 this yields about 
> 50% so it doubles the speed of the fastpath. On other architectures the 
> cmpxchg is so slow that it is not worth it (ia64 f.e.)

As Christoph pointed out, we even saw a small slowdown on ia64 because
it has no concept of atomicity wrt only one CPU. Emulating this with
irq disable has been tried, but the additional memory barriers alone
hurt performance a bit. We tried to come up with clever macros that
switch between irq disable and cmpxchg_local depending on the
architecture, but all the results were awkward.

I guess it's time for me to repost my patchset. I use interrupt disable
to emulate cmpxchg_local on architectures that lack atomic ops.

# cmpxchg_local and cmpxchg64_local standardization
add-cmpxchg-local-to-generic-for-up.patch
i386-cmpxchg64-80386-80486-fallback.patch
add-cmpxchg64-to-alpha.patch
add-cmpxchg64-to-mips.patch
add-cmpxchg64-to-powerpc.patch
add-cmpxchg64-to-x86_64.patch
#
add-cmpxchg-local-to-arm.patch
add-cmpxchg-local-to-avr32.patch
add-cmpxchg-local-to-blackfin.patch
add-cmpxchg-local-to-cris.patch
add-cmpxchg-local-to-frv.patch
add-cmpxchg-local-to-h8300.patch
add-cmpxchg-local-to-ia64.patch
add-cmpxchg-local-to-m32r.patch
fix-m32r-__xchg.patch
fix-m32r-include-sched-h-in-smpboot.patch
local_t_m32r_optimized.patch
add-cmpxchg-local-to-m68k.patch
add-cmpxchg-local-to-m68knommu.patch
add-cmpxchg-local-to-parisc.patch
add-cmpxchg-local-to-ppc.patch
add-cmpxchg-local-to-s390.patch
add-cmpxchg-local-to-sh.patch
add-cmpxchg-local-to-sh64.patch
add-cmpxchg-local-to-sparc.patch
add-cmpxchg-local-to-sparc64.patch
add-cmpxchg-local-to-v850.patch
add-cmpxchg-local-to-xtensa.patch
#
slub-use-cmpxchg-local.patch
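
For readers unfamiliar with the fallback mentioned above, the
interrupt-disable emulation can be sketched roughly as follows. This is a
user-space approximation, not the code from the patches:
sketch_irq_save() and sketch_irq_restore() are hypothetical stand-ins for
the kernel's local_irq_save()/local_irq_restore(), and they only track a
flag so that the example compiles outside the kernel.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's interrupt-enable state. */
static bool sketch_irqs_off = false;

/* Stand-in for local_irq_save(): remember the old state, then disable. */
static unsigned long sketch_irq_save(void)
{
    unsigned long flags = sketch_irqs_off ? 1UL : 0UL;

    sketch_irqs_off = true;
    return flags;
}

/* Stand-in for local_irq_restore(): put the saved state back. */
static void sketch_irq_restore(unsigned long flags)
{
    sketch_irqs_off = (flags != 0);
}

/*
 * Emulated cmpxchg_local: compare *ptr with old and, if equal, store
 * new_val. The sequence is atomic only with respect to interrupts on the
 * local CPU, not with respect to other CPUs. Returns the previous value.
 */
static unsigned long sketch_cmpxchg_local(volatile unsigned long *ptr,
                                          unsigned long old,
                                          unsigned long new_val)
{
    unsigned long flags, prev;

    flags = sketch_irq_save();
    prev = *ptr;
    if (prev == old)
        *ptr = new_val;
    sketch_irq_restore(flags);

    return prev;
}
```

A caller treats the return value the same way as with a native cmpxchg:
the exchange succeeded if and only if the returned value equals old.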

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68