Message-ID: <20130214105027.GB25282@gmail.com>
Date: Thu, 14 Feb 2013 11:50:27 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "H. Peter Anvin" <hpa@...or.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Rik van Riel <riel@...hat.com>, rostedt@...dmiss.org,
aquini@...hat.com, Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Michel Lespinasse <walken@...gle.com>,
linux-tip-commits@...r.kernel.org
Subject: Re: [tip:core/locking] x86/smp: Move waiting on contended ticket
lock out of line

* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Wed, Feb 13, 2013 at 8:20 AM, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > Adding an external function call is *horrible*, and you
> > might almost as well just uninline the spinlock entirely if
> > you do this. It means that all the small callers now have
> > their registers trashed, whether the unlikely function call
> > is taken or not, and now leaf functions aren't leaves any
> > more.
>
> Btw, we've had things like this before, and I wonder if we
> could perhaps introduce the notion of a "light-weight call"
> for fastpath code that calls unlikely slow-path code..
>
> In particular, see the out-of-line code used by the rwlocks
> etc (see "arch_read_lock()" for an example in
> arch/x86/include/asm/spinlock.h and arch/x86/lib/rwlock.S),
> where we end up calling things from inline asm, with one big
> reason being exactly the fact that a "normal" C call has such
> horribly detrimental effects on the caller.
>
> Sadly, gcc doesn't seem to allow specifying which registers
> are clobbered any easy way, which means that both the caller
> and the callee *both* tend to need to have some asm interface.
> So we bothered to do this for __read_lock_failed, but we have
> *not* bothered to do the same for the otherwise very similar
> __mutex_fastpath_lock() case, for example.

At least on x86, how about saving *all* volatile (caller-saved)
registers to the stack in the slow out-of-line code path?

That means we wouldn't have to do anything fancy with the called
functions, and the caller would see minimal register impact. It
would also be reasonably robust and straightforward assembly
code.

It blows up the slow path somewhat, but it would allow us to
keep the fast-path register impact even smaller - as the slow
path would only have memory content side effects.
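
A minimal sketch of that idea, assuming x86-64 userspace, ELF, and GCC or
Clang. Every `demo_*` name is hypothetical and this is not the kernel's
actual code. The fast path is modelled loosely on the decrement-and-test
shape of __mutex_fastpath_lock(). In the kernel the red-zone skip and the
xmm clobbers would not be needed, since the kernel is built without a red
zone and without SSE.

```c
#include <stdint.h>

/* Hypothetical counter so the demo can observe slow-path entries. */
int demo_slowpath_calls;

/* Hypothetical slow path: ordinary C, free to clobber any caller-saved
 * register. A real one would spin or sleep until the lock is free. */
void demo_lock_slowpath(int32_t *count);
void demo_lock_slowpath(int32_t *count)
{
    (void)count;
    demo_slowpath_calls++;
}

#if defined(__x86_64__) && defined(__ELF__)
/* Out-of-line thunk: saves every caller-saved integer register, realigns
 * the stack, calls the C slow path, then restores everything. */
__asm__(
    ".pushsection .text\n"
    ".type demo_slowpath_thunk, @function\n"
    "demo_slowpath_thunk:\n"
    "    pushq %rbp\n"
    "    movq  %rsp, %rbp\n"
    "    pushq %rax\n"
    "    pushq %rcx\n"
    "    pushq %rdx\n"
    "    pushq %rsi\n"
    "    pushq %rdi\n"
    "    pushq %r8\n"
    "    pushq %r9\n"
    "    pushq %r10\n"
    "    pushq %r11\n"
    "    andq  $-16, %rsp\n"          /* 16-byte alignment for the C call */
    "    call  demo_lock_slowpath\n"  /* argument is still in %rdi */
    "    leaq  -72(%rbp), %rsp\n"     /* back to the 9 saved registers */
    "    popq  %r11\n"
    "    popq  %r10\n"
    "    popq  %r9\n"
    "    popq  %r8\n"
    "    popq  %rdi\n"
    "    popq  %rsi\n"
    "    popq  %rdx\n"
    "    popq  %rcx\n"
    "    popq  %rax\n"
    "    popq  %rbp\n"
    "    ret\n"
    ".size demo_slowpath_thunk, . - demo_slowpath_thunk\n"
    ".popsection\n"
);
#endif

/* Fast path: one atomic decrement; the thunk is called only on contention. */
static inline void demo_fastpath_lock(int32_t *count)
{
    if (__atomic_sub_fetch(count, 1, __ATOMIC_ACQUIRE) >= 0)
        return;                         /* uncontended */
#if defined(__x86_64__) && defined(__ELF__)
    __asm__ volatile(
        "leaq -128(%%rsp), %%rsp\n\t"   /* step over the userspace red zone */
        "call demo_slowpath_thunk\n\t"
        "leaq 128(%%rsp), %%rsp"
        : /* no outputs: the thunk restores all integer registers */
        : "D"(count)
        : "memory", "cc",
          /* userspace only: the C slow path may touch SSE registers */
          "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
          "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14",
          "xmm15");
#else
    demo_lock_slowpath(count);          /* plain C call on other targets */
#endif
}
```

With `count` starting at 1, the first demo_fastpath_lock() call stays
inline and the second one goes through the thunk. The compiler is told
that no integer register is clobbered at the call site, so a small caller
can keep its values in registers across the unlikely branch.
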

Am I missing something?

Thanks,

	Ingo