Message-ID: <CA+55aFwaf_Wst=AS75ydBJVQ6aJxPfAzXdt-UXj3qC9WeUt7kw@mail.gmail.com>
Date: Sun, 22 Sep 2013 15:22:52 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Benjamin Herrenschmidt <benh@...nel.crashing.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
"H. Peter Anvin" <hpa@...or.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
Paul Mackerras <paulus@....ibm.com>,
Ingo Molnar <mingo@...nel.org>,
James Hogan <james.hogan@...tec.com>,
"James E.J. Bottomley" <jejb@...isc-linux.org>,
Helge Deller <deller@....de>,
Martin Schwidefsky <schwidefsky@...ibm.com>,
Heiko Carstens <heiko.carstens@...ibm.com>,
"David S. Miller" <davem@...emloft.net>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix

On Sun, Sep 22, 2013 at 2:56 PM, Benjamin Herrenschmidt
<benh@...nel.crashing.org> wrote:
> On Sun, 2013-09-22 at 18:24 +0200, Peter Zijlstra wrote:
>>
>> We use a segment offset. Something like:
>>
>> inc %gs:var;
>>
>
> And gcc makes no stupid assumptions that this gs doesn't change ? That's
> the main problem we have with using r13 for PACA.

Since gcc doesn't really know about segment registers at all (modulo
%fs as TLS on x86), we do everything like that using inline asm.

It's not *too* painful if you have a number of macro helpers to build
up all the different versions.

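As a rough illustration of what such macro helpers can look like, here
is a minimal sketch, not the kernel's actual implementation. The names
DEFINE_SKETCH_ADD and sketch_add_N are hypothetical. The kernel versions
would carry the %gs: segment prefix on the memory operand; it is left
out here on the assumption that the sketch should also build and run in
user space, and non-x86-64 targets fall back to plain C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): stamp out one "add" accessor
 * per operand size. A per-cpu version would use a "%%gs:" prefix on the
 * memory operand; that prefix is omitted in this user-space sketch.
 */
#if defined(__x86_64__)
#define DEFINE_SKETCH_ADD(name, type, insn)			\
static inline void name(type *ptr, type val)			\
{								\
	/* single read-modify-write instruction on memory */	\
	__asm__ volatile(insn " %1, %0"				\
			 : "+m" (*ptr)				\
			 : "r" (val)				\
			 : "cc");				\
}
#else
/* Portable fallback so the sketch builds on other architectures. */
#define DEFINE_SKETCH_ADD(name, type, insn)			\
static inline void name(type *ptr, type val)			\
{								\
	*ptr += val;						\
}
#endif

DEFINE_SKETCH_ADD(sketch_add_1, uint8_t,  "addb")
DEFINE_SKETCH_ADD(sketch_add_2, uint16_t, "addw")
DEFINE_SKETCH_ADD(sketch_add_4, uint32_t, "addl")
DEFINE_SKETCH_ADD(sketch_add_8, uint64_t, "addq")
```

Each size gets its own instruction suffix, and the compiler picks the
matching register width from the operand type.
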
And r13 isn't volatile if you are preempt-safe, so I'm wondering if
you could just make the preempt disable code mark %r13 as modified
("+r"). Then gcc won't ever cache r13 across one of those. And if you
don't have preemption disabled, then you cannot do multiple ops using
%r13 anyway, since on a load-store architecture it might change even
between the load and store, so a per-cpu "add" operation *has* to
cache the %r13 value in *another* register anyway, because using
memory ops with just an offset off %r13 would be buggy.

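Both halves of that argument can be sketched in portable GNU C. This is
an illustration under stated assumptions, not powerpc kernel code: the
global sketch_r13 merely stands in for the value held in %r13, and every
sketch_* name is hypothetical. An empty asm with a "+r" operand tells
the compiler the value may have changed, and the add helper takes one
snapshot of the base and uses it for both the load and the store.

```c
#include <assert.h>

/* Empty asm that claims to read and modify 'var', so the compiler
 * cannot carry an earlier copy of its value across this point. */
#define SKETCH_HIDE_VAR(var) __asm__ volatile("" : "+r" (var))

#define SKETCH_NR_CPUS 2

static long sketch_counters[SKETCH_NR_CPUS];

/* Stand-in for %r13: byte offset of the current CPU's slot (assumption). */
static unsigned long sketch_r13;

static inline void sketch_preempt_disable(void)
{
	/* A real implementation would also bump a preempt count. */
	SKETCH_HIDE_VAR(sketch_r13);
}

static inline void sketch_percpu_add(long delta)
{
	unsigned long base = sketch_r13;	/* one snapshot of the base */
	long *slot;

	/* Make the snapshot opaque so it is not re-read from sketch_r13. */
	SKETCH_HIDE_VAR(base);

	slot = (long *)((char *)sketch_counters + base);
	*slot = *slot + delta;	/* load and store both use the snapshot */
}
```
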
So I don't think this is a gcc issue. gcc can't fix those kinds of problems.

Personally, I'd suggest something like:

 - the paca stuff is just insane. Try to get rid of it.

 - use %r13 for the per-thread thread-info pointer instead. A
   per-thread pointer is *not* volatile like the per-cpu base is.

 - Now you can make the per-cpu offset be loaded off the per-thread
   pointer (update it at context switch). gcc knows to not cache it
   across function calls, since it's a memory access. Use ACCESS_ONCE()
   or something to make sure it's only loaded once for the cpu offset
   ops.

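The scheme in the list above could look roughly like the following
sketch. It is a user-space model under assumptions, not a proposed
patch: struct sketch_thread_info, sketch_current (standing in for the
pointer kept in %r13) and the sketch_* functions are all hypothetical.
ACCESS_ONCE() is written out the way the kernel defined it at the time,
as a volatile cast.

```c
#include <assert.h>

#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

#define SKETCH_NR_CPUS 2

struct sketch_thread_info {
	unsigned long cpu_offset;	/* refreshed at context switch */
};

static long sketch_counter[SKETCH_NR_CPUS];

/* Stand-in for the per-thread pointer that would live in %r13. */
static struct sketch_thread_info *sketch_current;

/* Model of a context switch: record which CPU's area the thread uses. */
static void sketch_switch_to(struct sketch_thread_info *next, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	next->cpu_offset = (unsigned long)cpu * sizeof(sketch_counter[0]);
	sketch_current = next;
}

static long *sketch_this_cpu_ptr(void)
{
	/* Load the offset exactly once; later uses share this value. */
	unsigned long off = ACCESS_ONCE(sketch_current->cpu_offset);

	return (long *)((char *)sketch_counter + off);
}

static void sketch_this_cpu_add(long delta)
{
	long *p = sketch_this_cpu_ptr();

	*p += delta;	/* load and store hit the same CPU's slot */
}
```

Because the offset is an ordinary memory load off a stable per-thread
pointer, the compiler needs no special knowledge of the register.
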
Alternatively, make %r13 point to the percpu side, but make sure that
you always use an asm accessor to fetch the value. In particular, I
think you need to make __my_cpu_offset be an inline asm that fetches
%r13 into some other register. Otherwise you can never get it right.

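An asm accessor of that kind might be sketched as follows. This is an
assumption about the shape of such a helper, not the code powerpc ended
up with: sketch_my_cpu_offset and sketch_fake_r13 are hypothetical
names, the "mr" form applies only when building for powerpc64, and
other targets read a stand-in variable so the sketch still compiles.

```c
#include <assert.h>

/* Stand-in for %r13 on targets that are not powerpc64 (assumption). */
static unsigned long sketch_fake_r13;

static inline unsigned long sketch_my_cpu_offset(void)
{
	unsigned long off;

#if defined(__powerpc64__)
	/* Copy r13 into a compiler-chosen register, once per call. */
	__asm__ volatile("mr %0, 13" : "=r" (off));
#else
	off = sketch_fake_r13;
	/* Make the copy opaque so the compiler cannot re-read the source. */
	__asm__ volatile("" : "+r" (off));
#endif
	return off;
}
```

A caller would fetch the offset once through this accessor and use the
returned value for both the load and the store of a per-cpu operation.
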
Linus