Message-ID: <454E6A2A.5070002@vmware.com>
Date: Sun, 05 Nov 2006 14:48:10 -0800
From: Zachary Amsden <zach@...are.com>
To: Linus Torvalds <torvalds@...l.org>
Cc: Arjan van de Ven <arjan@...radead.org>, Andi Kleen <ak@...e.de>,
Benjamin LaHaise <bcrl@...ck.org>,
Chuck Ebbert <76306.1226@...puserve.com>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [rfc patch] i386: don't save eflags on task switch
Linus Torvalds wrote:
> On Sun, 5 Nov 2006, Arjan van de Ven wrote:
>
>> actually lockdep is pretty good at finding this type of bug IMMEDIATELY
>> even without the actual race triggering ;)
>>
>
> Ehh. Last time this happened, lockdep didn't find _squat_.
>
> This was when NT and AC leaked across context switches, because the
> context switching had removed the "expensive" save/restore.
>
Owning up to being the one who introduced the thing. Actually, it was a
pretty nice win for native, and a huge win for paravirtualization;
calling out to two helper functions for save / restore flags while
shuffling the stack is just awfully bad during such a critical region.
If you look back all the way to the 2.4 kernel series, there was no save /
restore of flags, and it didn't look like there ever was. Somewhere during
2.5 development, it migrated in as an unchangelogged fix, and I was able
to dig up an email thread and reason that IOPL was leaking. Course,
instead of thinking it all the way through, I thought the precedent of
having no eflags switching would be good enough with an explicit IOPL
switch. Then nasty AC / NT raised their heads.
ID can be a problem as well; system calls during a code region which is
testing for a Pentium by toggling the ID bit (perhaps from a printf()
libc call) can cause the ID bit to leak onto another process or get
lost, causing CPUID detection to fail.
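
To make the ID case concrete, here is a toy model in plain C, a sketch only and not kernel code. It assumes one shared variable stands in for the hardware EFLAGS register, and every name in it (toy_task, cpu_eflags, switch_no_flags, switch_with_flags, id_bit_disturbed) is hypothetical. Task A flips ID (bit 21) as a CPUID probe would, gets switched out in the middle, task B writes its own flags with ID clear, and then A resumes.

```c
#include <stdint.h>

#define X86_EFLAGS_ID (1u << 21)   /* ID flag: bit 21 of EFLAGS */

/* Hypothetical per-task state: just a saved copy of the flags. */
struct toy_task {
    uint32_t saved_eflags;
};

/* Stands in for the real EFLAGS register in this model. */
static uint32_t cpu_eflags;

/* Switch that leaves flags alone: whatever is in the register carries over. */
static void switch_no_flags(struct toy_task *prev, struct toy_task *next)
{
    (void)prev;
    (void)next;
}

/* Switch that saves the outgoing task's flags and loads the incoming ones. */
static void switch_with_flags(struct toy_task *prev, struct toy_task *next)
{
    prev->saved_eflags = cpu_eflags;
    cpu_eflags = next->saved_eflags;
}

/*
 * Returns 1 if A's ID toggle leaked into B or was lost by the time A
 * resumed, 0 if both tasks saw only their own flags.
 */
static int id_bit_disturbed(void (*do_switch)(struct toy_task *,
                                              struct toy_task *))
{
    struct toy_task a = { 0 }, b = { 0 };
    int leaked, lost;

    cpu_eflags = 0;
    cpu_eflags ^= X86_EFLAGS_ID;            /* A toggles ID mid-probe */

    do_switch(&a, &b);                      /* A is switched out */
    leaked = (cpu_eflags & X86_EFLAGS_ID) != 0;   /* B sees A's bit? */
    cpu_eflags &= ~X86_EFLAGS_ID;           /* B writes flags, ID clear */

    do_switch(&b, &a);                      /* A resumes its probe */
    lost = (cpu_eflags & X86_EFLAGS_ID) == 0;     /* A's toggle gone? */

    return leaked || lost;
}
```

In this model id_bit_disturbed(switch_no_flags) returns 1 and id_bit_disturbed(switch_with_flags) returns 0. The real entry and exit paths are more involved than this; the model only shows why an unsaved flag bit can end up in the wrong task.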
I like Chuck's new set_eflags() since it fixes all this in a way we
don't have to reason about heavily. Also, moving it to C code instead
of the assembler path is more maintainable. IMHO, the assembler task
switch should switch the stack, which you can't do in C, and that is
it. Everything else can be nicely packaged above it, including the
get_eflags().
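
A rough sketch of that split, again as a toy model in plain C rather than the actual patch. The names get_eflags() and set_eflags() come from the discussion, but their signatures here are assumptions, and toy_cpu, toy_task, raw_switch_stack() and toy_switch_to() are hypothetical. raw_switch_stack() stands in for the assembler part and touches only the stack pointer; everything to do with flags lives in the C wrapper around it.

```c
#include <stdint.h>

#define X86_EFLAGS_NT (1u << 14)   /* Nested Task flag */
#define X86_EFLAGS_AC (1u << 18)   /* Alignment Check flag */

/* Hypothetical stand-ins for the CPU registers and per-task saved state. */
struct toy_cpu  { uint32_t eflags; uintptr_t sp; };
struct toy_task { uint32_t eflags; uintptr_t sp; };

static struct toy_cpu cpu;

/* Assumed signatures: read and write the whole flags word. */
static uint32_t get_eflags(void)        { return cpu.eflags; }
static void     set_eflags(uint32_t v)  { cpu.eflags = v; }

/* Stand-in for the assembler piece: swap stack pointers, nothing else. */
static void raw_switch_stack(struct toy_task *prev, struct toy_task *next)
{
    prev->sp = cpu.sp;
    cpu.sp = next->sp;
}

/* C wrapper: flags are saved before and restored after the stack switch. */
static void toy_switch_to(struct toy_task *prev, struct toy_task *next)
{
    prev->eflags = get_eflags();    /* capture outgoing task's flags */
    raw_switch_stack(prev, next);   /* the only part that needs asm */
    set_eflags(next->eflags);       /* incoming task gets its own flags */
}
```

Because the whole flags word is saved and restored, NT, AC, ID and IOPL are all covered without reasoning about each bit separately.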
Zach