linux-kernel - Re: [cpuops cmpxchg V2 3/5] irq_work: Use per cpu atomics instead of regular atomics

Message-ID: <alpine.DEB.2.00.1012151059430.13049@router.home>
Date:	Wed, 15 Dec 2010 11:04:42 -0600 (CST)
From:	Christoph Lameter <cl@...ux.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
cc:	Tejun Heo <tj@...nel.org>, akpm@...ux-foundation.org,
	Pekka Enberg <penberg@...helsinki.fi>,
	linux-kernel@...r.kernel.org,
	Eric Dumazet <eric.dumazet@...il.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Subject: Re: [cpuops cmpxchg V2 3/5] irq_work: Use per cpu atomics instead
 of regular atomics

On Wed, 15 Dec 2010, Peter Zijlstra wrote:

> On Wed, 2010-12-15 at 17:32 +0100, Tejun Heo wrote:
> > On 12/14/2010 05:28 PM, Christoph Lameter wrote:
> > > The irq work queue is a per cpu object and it is sufficient for
> > > synchronization if per cpu atomics are used. Doing so simplifies
> > > the code and reduces the overhead of the code.
> > >
> > > Before:
> > >
> > > christoph@...ux-2.6$ size kernel/irq_work.o
> > >    text	   data	    bss	    dec	    hex	filename
> > >     451	      8	      1	    460	    1cc	kernel/irq_work.o
> > >
> > > After:
> > >
> > > christoph@...ux-2.6$ size kernel/irq_work.o
> > >    text	   data	    bss	    dec	    hex	filename
> > >     438	      8	      1	    447	    1bf	kernel/irq_work.o
> > >
> > > Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> >
> > Peter, can you please ack this one?
>
> I guess so, I don't much like the bare preempt_disable/enable there, and
> I'm wondering, aren't %fs prefixed insn slower than regular insn? Does
> it really pay to avoid this one address computation if there's multiple
> users in a function. %fs prefixes do take another byte, so it will also
> result in larger code at some point.

Prefixes are faster than explicit address calculations. A segment prefix
lets you fold the per cpu address calculation into the arithmetic
operation itself.

A prefix is one byte, which is less than the multiple arithmetic
instructions otherwise needed to calculate the address.
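
To illustrate the difference, here is a minimal user space sketch, not the
kernel code. It assumes x86-64 Linux, where compilers typically access
thread-local variables through a %fs-prefixed instruction, which is a rough
analogue of a per cpu operation. NR_CPUS, counters, tls_counter and both
function names are hypothetical and exist only for this example.

```c
#include <stddef.h>

#define NR_CPUS 4			/* hypothetical cpu count for the sketch */

/* Variant 1: explicit address calculation (base + cpu * size), then the add. */
static long counters[NR_CPUS];

long inc_explicit(int cpu)
{
	long *p = &counters[cpu];	/* separate address computation */

	return ++*p;
}

/*
 * Variant 2: thread-local variable. On x86-64 Linux the compiler usually
 * emits one segment-prefixed instruction for the increment (assumption:
 * the exact code generation depends on compiler, flags and TLS model).
 */
static _Thread_local long tls_counter;

long inc_segment(void)
{
	return ++tls_counter;
}
```

Comparing the output of "gcc -O2 -S" for the two functions should show
whether the second variant avoids the separate address computation on a
given compiler.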

I am not sure that the preempt_disable/enable calls are needed. They are
just there because you had a get/put_cpu there.

If the code is run from hardirq context then preemption is already
disabled. We can just drop those then.




