Message-ID: <20130922162410.GA10649@laptop.programming.kicks-ass.net>
Date:	Sun, 22 Sep 2013 18:24:10 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
Cc:	"H. Peter Anvin" <hpa@...or.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>,
	Paul Mackerras <paulus@....ibm.com>,
	Ingo Molnar <mingo@...nel.org>,
	James Hogan <james.hogan@...tec.com>,
	"James E.J. Bottomley" <jejb@...isc-linux.org>,
	Helge Deller <deller@....de>,
	Martin Schwidefsky <schwidefsky@...ibm.com>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	"David S. Miller" <davem@...emloft.net>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix

On Sun, Sep 22, 2013 at 02:41:01PM +1000, Benjamin Herrenschmidt wrote:
> On Sun, 2013-09-22 at 14:39 +1000, Benjamin Herrenschmidt wrote:
> > How do you do your per-cpu on x86 ? 

We use a segment offset. Something like:

  inc %gs:var;

would be a per-cpu increment. The actual memory location used for the
memop is the variable address plus the GS offset.

And the GS offset is per-CPU: it points to the base of that CPU's
per-cpu segment.
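
Concretely, a minimal GNU C sketch of such a GS-relative increment (not
the kernel's actual percpu.h code; the variable and helper names are
made up for illustration):

  /* "counter" stands in for a per-cpu variable whose address is really
   * an offset into the per-cpu area; the %gs prefix makes the CPU add
   * the per-cpu base to that offset when the memop executes. */
  static unsigned long counter;

  static inline void my_percpu_inc(void)
  {
  	asm volatile("incq %%gs:%0" : "+m" (counter));
  }

Being a single memop it is atomic wrt IRQs and preemption on the local
CPU, no lock prefix needed.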

> Also, do you have a half-decent way of getting to per-cpu from asm ?

Yes, see above :-)

Assuming we repurpose r13 as the per-cpu base, you could do the whole
this_cpu_* family of ops, which is locally atomic -- i.e. safe against
IRQs and preemption -- like so:

loop:
	lwarx	rt, var, r13	# load-reserve *(var + r13)
	addi	rt, rt, 1	# increment
	stwcx.	rt, var, r13	# store-conditional back
	bne-	loop		# retry if the reservation was lost

Except, I think your ll/sc pair is actually slower than doing:

  local_irq_save(flags);
  var++;
  local_irq_restore(flags);

Especially with the lazy irq disable you have.
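
In generic terms that alternative is just a sketch like the following
(my_cpu_ptr() is a hypothetical accessor returning this CPU's copy of
the variable; local_irq_save()/local_irq_restore() are the usual
helpers):

  #define my_this_cpu_inc(var)				\
  do {							\
  	unsigned long __flags;				\
  	local_irq_save(__flags);			\
  	(*my_cpu_ptr(&(var)))++;			\
  	local_irq_restore(__flags);			\
  } while (0)

A plain load and store bracketed by the irq save/restore, instead of
the reservation traffic of the lwarx/stwcx. loop.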

And I'm fairly sure using ll/sc pairs as generic per-cpu accessors
isn't sane, but I'm not sure PPC64 has other memops with an implicit
addition like that.

As to the problem of GCC moving r13 about: some archs have exceptions
in the register allocator and leave certain registers alone. IIRC MIPS
has this and uses one of those registers (ISTR there are two) for the
per-cpu base address.
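
For example (a sketch of the mechanism, not what any particular arch
actually ships): GCC can keep a register away from the allocator with
-ffixed-<reg>, and a global register variable then gives C code a name
for it:

  /* Reserve r13 for the per-cpu base; declaring a global register
   * variable (optionally together with -ffixed-r13) keeps the
   * allocator's hands off it. */
  register unsigned long __my_percpu_base asm("r13");

  static inline void *my_cpu_addr(unsigned long offset)
  {
  	return (void *)(__my_percpu_base + offset);
  }

Whatever dereferences the result still has to be careful about
preemption migrating the task to another CPU, of course.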




