Message-ID: <5072F497.2000703@intel.com>
Date: Mon, 08 Oct 2012 08:43:19 -0700
From: Alexander Duyck <alexander.h.duyck@...el.com>
To: Andi Kleen <andi@...stfloor.org>
CC: konrad.wilk@...cle.com, tglx@...utronix.de, mingo@...hat.com,
hpa@...or.com, rob@...dley.net, akpm@...ux-foundation.org,
joerg.roedel@....com, bhelgaas@...gle.com, shuahkhan@...il.com,
linux-kernel@...r.kernel.org, devel@...uxdriverproject.org,
x86@...nel.org, torvalds@...ux-foundation.org
Subject: Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical
addresses
On 10/06/2012 10:57 AM, Andi Kleen wrote:
>> Inlining everything did speed things up a bit, but I still didn't reach
>> the same speed I achieved using the patch set. However I did notice the
>> resulting swiotlb code was considerably larger.
> Thanks. So your patch makes sense, but imho should pursue the inlining
> in parallel for other call sites.
I'll try to take a look at getting that done this morning.
>> assembly, is replaced with 8 lines of assembly and becomes inline. In
>> addition we drop the number of calls to __phys_addr from 9 to 2 by
>> dropping them all from swiotlb. By my math I am probably saving about
>> 120 instructions per packet. I suspect that cuts the number of
>> instructions per packet enough to account for a 5% difference, given
>> that I am running at about 1.5Mpps per core on a 2.7GHz processor.
> Maybe it's just me, but that's somehow sad for one if() and a subtraction.
Well, there is also the setup of the call on the stack. By my count,
just the portion that is used in the standard case is about 9 lines of
assembly. By inlining it and dropping the if case we can probably get
that down to 1.
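As a rough sanity check on the per-packet numbers quoted above, the
cycle budget can be worked out directly. This is only a sketch: the
helper names are made up for illustration, and the instructions-per-cycle
figure is an assumption, since the thread does not state one.

```c
/*
 * Back-of-the-envelope check of the per-packet savings.
 * All names here are hypothetical helpers, not kernel code.
 */

/* Cycles one core can spend on each packet at a given packet rate. */
double cycles_per_packet(double clock_hz, double packets_per_sec)
{
    return clock_hz / packets_per_sec;
}

/*
 * Fraction of the per-packet cycle budget recovered by removing
 * saved_instructions, assuming the core retires ipc instructions per
 * cycle on average (ipc is an assumed parameter, not a measured one).
 */
double saved_fraction(double saved_instructions, double ipc,
                      double budget_cycles)
{
    return (saved_instructions / ipc) / budget_cycles;
}
```

With 2.7GHz and 1.5Mpps the budget is 1800 cycles per packet. At an
assumed 1 instruction per cycle, 120 instructions is roughly 6.7% of
that budget, which is in the same range as the 5% figure.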
> BTW __pa used to be a simple subtraction, the if () was just added to
> handle the few call sites for x86-64 that do __pa(&text_symbol).
> Maybe we should just go back to the old __pa_symbol() for those cases,
> then __pa could again be the simple subtraction it used to be
> and it could be inlined and everyone would be happy.
>
> -Andi
I will probably split the function in two as you suggest, with a
separate function for the text symbol case. I will probably also follow
the 32-bit approach and add a debug version that remains an out-of-line
function, so that we can, for example, find any callers that should be
using __pa_symbol instead of __pa.
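A minimal userspace sketch of that split, to make the idea concrete. It
is not the actual patch: every sketch_* name is hypothetical, the two
base constants are illustrative stand-ins for the kernel's layout
values, and the misuse counter is just one assumed way a debug build
might flag callers.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for the direct-map and kernel-text bases. */
#define SKETCH_PAGE_OFFSET  UINT64_C(0xffff880000000000)
#define SKETCH_KERNEL_MAP   UINT64_C(0xffffffff80000000)

/* Stand-in for the physical load offset of the kernel image. */
static uint64_t sketch_phys_base;

/* Combined shape: one out-of-line function with a branch per call. */
uint64_t sketch_phys_addr_combined(uint64_t vaddr)
{
    if (vaddr >= SKETCH_KERNEL_MAP)
        return vaddr - SKETCH_KERNEL_MAP + sketch_phys_base;
    return vaddr - SKETCH_PAGE_OFFSET;
}

/* Split, common case: direct-map address, a single subtraction. */
static inline uint64_t sketch_pa(uint64_t vaddr)
{
    return vaddr - SKETCH_PAGE_OFFSET;
}

/* Split, rare case: address of a symbol in the kernel image. */
static inline uint64_t sketch_pa_symbol(uint64_t vaddr)
{
    return vaddr - SKETCH_KERNEL_MAP + sketch_phys_base;
}

/* Count of debug-path calls that passed a kernel-image address. */
static unsigned long sketch_pa_misuse_count;

/*
 * Debug variant: stays a real function so it can carry checks.
 * It rejects addresses below the direct map and counts callers that
 * should have used sketch_pa_symbol() instead.
 */
uint64_t sketch_pa_debug(uint64_t vaddr)
{
    assert(vaddr >= SKETCH_PAGE_OFFSET);
    if (vaddr >= SKETCH_KERNEL_MAP)
        sketch_pa_misuse_count++;
    return sketch_pa(vaddr);
}
```

The point of the split is that the common path loses both the call and
the branch, while the checks live only in the debug variant.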
Thanks,
Alex