Message-Id: <200811191531.41652.nickpiggin@yahoo.com.au>
Date: Wed, 19 Nov 2008 15:31:41 +1100
From: Nick Piggin <nickpiggin@...oo.com.au>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: David Miller <davem@...emloft.net>, mingo@...e.hu,
dada1@...mosbay.com, rjw@...k.pl, linux-kernel@...r.kernel.org,
kernel-testers@...r.kernel.org, cl@...ux-foundation.org,
efault@....de, a.p.zijlstra@...llo.nl, shemminger@...tta.com
Subject: Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28
On Wednesday 19 November 2008 02:58, Linus Torvalds wrote:
> On Tue, 18 Nov 2008, Nick Piggin wrote:
> > On Tuesday 18 November 2008 07:58, David Miller wrote:
> > > From: Linus Torvalds <torvalds@...ux-foundation.org>
> > >
> > > > Ok. It could easily be something like a cache footprint issue. And
> > > > while I don't know my sparc cpu's very well, I think the
> > > > Ultrasparc-IIIi is superscalar but does no out-of-order and
> > > > speculation, no?
> > >
> > > It does only very simple speculation, but your description is
> > > accurate.
> >
> > Surely it would do branch prediction, but maybe not indirect branch?
>
> That would be "branch target prediction" (and a BTB - "Branch Target
> Buffer" to hold it), and no, I don't think Sparc does that. You can
> certainly do it for in-order machines too, but I think it's fairly rare.
>
> It's sufficiently different from the regular "pick up the address from the
> static instruction stream, and also yank the kill-chain on mispredicted
> direction" to be real work to do. Unlike a compare or test instruction,
> it's not at all likely that you can resolve the final address in just a
> single pipeline stage, and without that, it's usually too late to yank the
> kill-chain.
>
> (And perhaps equally importantly, indirect branches are relatively rare on
> old-style Unix benchmarks - ie SpecInt/FP - or in databases. So it's not
> something that Sparc would necessarily have spent the effort on.)
>
> There is obviously one very special indirect jump: "ret". That's the one
> that is common, and that tends to have a special branch target buffer that
> is a pure stack. And for that, there is usually a special branch target
> register that needs to be set up 'x' cycles before the ret in order to
> avoid the stall (then the prediction is checking that register against the
> branch target stack, which is somewhat akin to a regular conditional
> branch comparison).
>
> So I strongly suspect that an indirect (non-ret) branch flushes the
> pipeline on sparc. It is possible that there is a "prepare to jump"
> instruction that prepares the indirect branch stack (kind of a "push
> prediction information"). I suspect Java sees a lot more indirect
> branches than traditional Unix loads, so maybe Sun did do that.

Probably true. OTOH, I've seen indirect branches get compiled to direct
branches, or the common case special-cased into a direct branch:

	if (object->fn == default_object_fn)
		default_object_fn();

That might be an easy way to test suspicions about CPU scheduler
slowdowns... (adding a likely() there, and using likely profiling, would
help ensure you got the default case right).
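
For concreteness, here is a minimal standalone sketch of that special-casing. It is not kernel code: struct object, other_object_fn() and call_object() are hypothetical names made up for illustration, and likely() is redefined locally on top of the GCC/Clang __builtin_expect builtin so the snippet compiles outside the kernel tree.

```c
#include <assert.h>

/* Local stand-in for the kernel's likely(); assumes GCC or Clang. */
#define likely(x) __builtin_expect(!!(x), 1)

/* Hypothetical object carrying an indirect hook. */
struct object {
	int (*fn)(struct object *obj);
	int value;
};

/* The implementation we expect to be installed almost always. */
static int default_object_fn(struct object *obj)
{
	return obj->value + 1;
}

/* A rarer alternative, only here to exercise the fallback path. */
static int other_object_fn(struct object *obj)
{
	return obj->value * 2;
}

/*
 * Compare the pointer against the expected target first. On a match
 * the compiler can emit a direct call (and may inline it), so the CPU
 * only has to predict an ordinary conditional branch. Anything else
 * still goes through the indirect call.
 */
static int call_object(struct object *obj)
{
	assert(obj && obj->fn);

	if (likely(obj->fn == default_object_fn))
		return default_object_fn(obj);	/* direct call */

	return obj->fn(obj);			/* indirect fallback */
}
```

If the guess about the common case is wrong, this costs an extra compare and a mispredicted conditional branch on top of the indirect call, which is why checking the hit rate of the likely() matters.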