linux-kernel - Re: [patch] add optimized generic percpu accessors


Message-ID: <20090115223438.GA7463@elte.hu>
Date:	Thu, 15 Jan 2009 23:34:38 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Tejun Heo <tj@...nel.org>
Cc:	roel kluin <roel.kluin@...il.com>,
	"H. Peter Anvin" <hpa@...or.com>, Brian Gerst <brgerst@...il.com>,
	ebiederm@...ssion.com, cl@...ux-foundation.org,
	rusty@...tcorp.com.au, travis@....com,
	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	steiner@....com, hugh@...itas.com
Subject: Re: [patch] add optimized generic percpu accessors


* Ingo Molnar <mingo@...e.hu> wrote:

> * Ingo Molnar <mingo@...e.hu> wrote:
> 
> > FYI, -tip testing found the following bug with your percpu stuff:
> > 
> > There's an early exception during bootup, on 64-bit x86:
> > 
> >   PANIC: early exception 0e rip 10:ffffffff80276855: error ? cr2 6688
> > 
> >  - gcc version 4.3.2 20081007 (Red Hat 4.3.2-6) (GCC) 
> >  - binutils-2.18.50.0.6-2.x86_64
> > 
> > config attached. You can find the disassembly of lock_release_holdtime() 
> > below - that's where it crashed:
> 
> Gut feeling: we must have gotten a window where the PDA is not right. 
> Note how it crashes in lock_release_holdtime() - that is 
> CONFIG_LOCK_STAT instrumentation - very lowlevel and pervasive. Function 
> tracing is also enabled although it should be inactive at the early 
> stages.
> 
> So i'd take a good look at the PDA setup portion of the boot code, and 
> see whether some spinlock acquire/release can slip in while the 
> PDA/percpu-area is not reliable.

Btw., if this turns out to be a genuine linker bug due to us crossing 
RIP-relative references from minus-1.9G-ish negative addresses up to 
slightly-above-zero positive addresses (or something similar - we are 
pushing the linker here quite a bit), we still have a contingency plan:

We can relocate the percpu symbols to small negative addresses, placing 
them so that the dynamic percpu area ends at zero.

Then we could bootmem alloc the percpu size plus 4096 bytes. Voila: we've 
got a page free for the canary field, at the end of the percpu area, and 
placed just correctly for gcc to see it at %gs:40.
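
To make the address arithmetic of this contingency layout concrete, here is 
a minimal user-space sketch. It is not kernel code: the struct, the function 
names and the idea of passing the allocation start as a plain integer are 
all hypothetical, chosen only for illustration. It assumes the %gs base is 
set to the end of the percpu data, so that percpu symbols resolve at small 
negative offsets and the canary falls into the extra page at offset 40.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXTRA_PAGE_SIZE 4096u  /* the additional page allocated past the percpu data */
#define CANARY_OFFSET   40u    /* gcc's stack protector reads the canary at %gs:40 */

/* Hypothetical description of one CPU's area under the contingency layout. */
struct percpu_layout {
	uintptr_t alloc_start;  /* start of the (bootmem) allocation */
	size_t    alloc_size;   /* percpu size plus one extra page */
	uintptr_t gs_base;      /* assumed %gs base: the end of the percpu data */
	uintptr_t canary_addr;  /* gs_base + 40, inside the extra page */
};

/* Compute the layout for an allocation starting at alloc_start. */
struct percpu_layout layout_percpu(uintptr_t alloc_start, size_t percpu_size)
{
	struct percpu_layout l;

	l.alloc_start = alloc_start;
	l.alloc_size  = percpu_size + EXTRA_PAGE_SIZE;
	l.gs_base     = alloc_start + percpu_size;   /* percpu area "ends at zero" */
	l.canary_addr = l.gs_base + CANARY_OFFSET;
	return l;
}

/*
 * Resolve a percpu symbol whose link-time value is a small negative
 * offset (between -percpu_size and -1) relative to the %gs base.
 */
uintptr_t percpu_var_addr(const struct percpu_layout *l, intptr_t sym_offset)
{
	return l->gs_base + (uintptr_t)sym_offset;
}
```

With a 0x8000-byte percpu area allocated at 0x100000, this gives a %gs base 
of 0x108000 and a canary at 0x108028, while a symbol at offset -0x8000 maps 
back to the start of the allocation. It also shows why an overflow past the 
end of the percpu data lands directly in the page holding the canary.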

But this isn't as clean and simple as your current patchset, so 
let's only do it if needed. It will also put some constraints on how we can 
shape dynamic percpu areas, and an overflow near the end of the dynamic 
percpu area could affect the canary - but still, it's a workable path as 
well, should the zero-based approach fail.

OTOH, this crash does not have the feeling of a linker bug to me - it has 
the feeling of a setup ordering anomaly.

	Ingo