Message-ID: <20100304202304.GA13718@elte.hu>
Date: Thu, 4 Mar 2010 21:23:04 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Steven Rostedt <rostedt@...dmis.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Nicholas Miell <nmiell@...cast.net>, laijs@...fujitsu.com,
dipankar@...ibm.com, akpm@...ux-foundation.org,
josh@...htriplett.org, dvhltc@...ibm.com, niv@...ibm.com,
tglx@...utronix.de, peterz@...radead.org, Valdis.Kletnieks@...edu,
dhowells@...hat.com, linux-kernel@...r.kernel.org,
Nick Piggin <npiggin@...e.de>,
Chris Friesen <cfriesen@...tel.com>,
Frédéric Weisbecker <fweisbec@...il.com>
Subject: Re: [PATCH -tip] introduce sys_membarrier(): process-wide memory
barrier (v9)
* Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>
> On Thu, 4 Mar 2010, Ingo Molnar wrote:
> >
> > - SA_NOFPU: on x86 to skip the FPU/SSE save/restore, for such fast in/out special
> > purpose signal handlers? (can whip up a quick patch for you if you want)
>
> I'd love to do this, but it's wrong.
>
> It's too damn easy to use the FPU by mistake in user land, without ever
> being aware of it. memset()/memcpy are obvious potential users of SSE, but they
> might be called in non-obvious ways implicitly by the compiler (ie structure
> copy and setup).
>
> And modern glibc ends up using SSE4 even for things like strstr and strlen,
> so it really is creeping into all kinds of trivial helper functions that
> might not be obvious. So SA_NOFPU is a lovely idea, but it's also an idea
> that sucks rotten eggs in practice, with quite possibly the same _binary_
> working or not working depending on what kind of CPU and what shared library
> it happens to be using.
>
> Too damn fragile, in other words.
>
> (Now, if it's accompanied by the kernel actually _testing_ that there is no
> FPU activity, by setting the TS flag and checking at fault time and causing
> a SIGFPE, then that would be better. At least you'd get a nice clear signal
> rather than random FPU state corruption. But you're still in the situation
> that now the binary might work on some machines and setups, and not on
> others.)

Perhaps NOFPU could do lazy context saving: set the TS flag and only save
the FPU state if it's actually used by the signal handler?

This turns it into a 'hint', not into an FPU state corruption issue.
Disabling/enabling FPU instructions via the TS flag is still faster than a
full-blown FPU context save/restore.

Careful and lightweight signal handlers (like a GC scheme would likely be)
would thus be faster. In the worst case it incurs an extra trap and a
(measurable/profilable) slowdown.
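
To make the lazy-save idea concrete, here is a rough user-space model of the
state machine, not kernel code. Every name in it (toy_task, toy_deliver() and
so on) is made up for illustration, the 'ts' field merely stands in for
CR0.TS, and SA_NOFPU itself is only a proposal. It ignores how TS interacts
with lazy FPU switching at context-switch time:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy user-space model; none of these names exist in the kernel. */
struct toy_fpu { unsigned char regs[512]; };	/* FXSAVE-sized area */

struct toy_task {
	struct toy_fpu live;	/* stands in for the hardware FPU registers */
	struct toy_fpu saved;	/* copy stashed away for sigreturn */
	bool ts;		/* models CR0.TS: FPU use traps while set */
	bool saved_valid;	/* was 'live' copied into 'saved'? */
	bool in_handler;
	unsigned saves, traps;	/* counters, to make the cost visible */
};

/*
 * Signal delivery. With the (hypothetical) SA_NOFPU hint we only arm the
 * trap; without it we save eagerly.
 */
static void toy_deliver(struct toy_task *t, bool nofpu_hint)
{
	t->in_handler = true;
	t->saved_valid = false;
	if (nofpu_hint) {
		t->ts = true;
	} else {
		t->saved = t->live;
		t->saved_valid = true;
		t->saves++;
	}
}

/* The handler touches the FPU: if TS is set, trap, save, clear TS. */
static void toy_handler_fpu_write(struct toy_task *t, unsigned char v)
{
	assert(t->in_handler);
	if (t->ts) {
		t->traps++;
		t->saved = t->live;
		t->saved_valid = true;
		t->saves++;
		t->ts = false;
	}
	memset(t->live.regs, v, sizeof(t->live.regs));
}

/* Return from the handler: restore only if something was saved. */
static void toy_sigreturn(struct toy_task *t)
{
	assert(t->in_handler);
	if (t->saved_valid)
		t->live = t->saved;
	t->ts = false;
	t->saved_valid = false;
	t->in_handler = false;
}
```

In this model a handler that never touches the FPU costs zero saves and zero
traps, while one that does costs one trap plus the save it would have paid
for anyway, and the interrupted context's state comes back intact either way.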

In any case this would be a secondary optimization - the biggest difference
i'd expect to come from the 'don't wake up the world' logic:

> > - SA_RUNNING: a way to signal only running threads - as a way for user-space
> > based concurrency control mechanisms to deschedule running threads (or, like
> > in your case, to implement barrier / garbage collection schemes).
>
> Hmm. This sounds less fundamentally broken, but at the same time also _way_
> more invasive in the signal handling layer. It's already one of our more
> "exciting" layers out there.

Yeah, definitely. But i still tend to think it should be actively tried, at
which point we can still say 'yuck, this cannot work, let's go for the
sys_membarrier() solution'.
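
For reference, the selection rule SA_RUNNING would need is roughly the
following. This is again a toy user-space model with invented names
(toy_thread, toy_signal_running()), and the 'on_cpu' flag is an assumption
about what the signal code could cheaply check. It says nothing about the
hard part, which is doing this race-free inside the real signal layer:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model; names are illustrative, not kernel API. */
struct toy_thread {
	bool on_cpu;		/* currently executing on some CPU */
	unsigned pending;	/* signals queued for this thread */
};

/*
 * Queue the signal only for threads that are running right now and
 * leave sleeping threads alone. Returns the number of threads hit.
 */
static size_t toy_signal_running(struct toy_thread *th, size_t n)
{
	size_t i, hit = 0;

	for (i = 0; i < n; i++) {
		if (!th[i].on_cpu)
			continue;	/* don't wake up the world */
		th[i].pending++;
		hit++;
	}
	return hit;
}
```

With four threads of which two are on a CPU, only those two get the signal
queued and the other two stay asleep.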
Ingo