Message-ID: <20111116044032.GA26476@bloggs.ozlabs.ibm.com>
Date: Wed, 16 Nov 2011 15:40:32 +1100
From: Paul Mackerras <paulus@...ba.org>
To: "Moffett, Kyle D" <Kyle.D.Moffett@...ing.com>
Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>,
"B04825@...escale.com" <B04825@...escale.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"paul.gortmaker@...driver.com" <paul.gortmaker@...driver.com>,
"scottwood@...escale.com" <scottwood@...escale.com>,
"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>
Subject: Re: [RFC PATCH 0/2] powerpc: CPU cache op cleanup

On Tue, Nov 15, 2011 at 04:45:18PM -0600, Moffett, Kyle D wrote:
> On Nov 15, 2011, at 17:29, Benjamin Herrenschmidt wrote:
> > On Mon, 2011-11-14 at 21:32 -0500, Kyle Moffett wrote:
> >> Unfortunately, I've been staring at PPC asm for long enough that I
> >> have a migraine headache and I'm going to have to stop here for now.
> >> If somebody else wants to tackle fixing up the 32-bit copy_page() and
> >> __copy_tofrom_user() routines it would be highly appreciated.
> >
> > Yeah that's the one everybody's avoiding :-)
> >
> > What about my idea of instead compiling it multiple times with a
> > different size and fixing up the branch to call the right one ?
>
> I guess that's doable, although I have to admit that idea almost gives
> me more of a headache than trying to fix up the 32-bit ASM.
>
> One thing that bothers me in particular is that both 32/64 versions of
> __copy_tofrom_user() are dramatically overcomplicated for what they
> ought to be doing.
>
> It would seem that if we get a page fault during an unaligned copy, we
> ought to just give up and fall back to a simple byte-by-byte copy loop
> from wherever we left off. That would eliminate 90% of the ugly
> special cases without actually hurting performance, right?

That's basically what we do, IIRC, and most of the complexity comes
from working out where we were up to. We could probably use a simpler
approximation that means we might copy some bytes twice. In fact the
greatest simplification would probably be to implement range entries
in the exception table so we can just have one entry for all the loads
and stores instead of an entry for each individual load and store.
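
To make the range-entry idea concrete, here is a minimal sketch in C. It
is an illustration only, not the kernel's exception-table code: the
struct ex_range_entry type and the search_ex_ranges() helper are
hypothetical names, and the sketch assumes the table is sorted by start
address with non-overlapping ranges. Each entry covers a half-open span
of instruction addresses [start, end) and names a single fixup address,
so one entry can stand in for every load and store in a copy loop.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical range-based exception table entry: a fault at any
 * instruction address in [start, end) resumes at fixup.
 */
struct ex_range_entry {
	uintptr_t start;	/* first covered instruction address */
	uintptr_t end;		/* one past the last covered address */
	uintptr_t fixup;	/* address to resume at after a fault */
};

/*
 * Binary search over a table that is assumed to be sorted by start
 * and to contain non-overlapping ranges.  Returns the entry covering
 * addr, or NULL if no range covers it.
 */
static const struct ex_range_entry *
search_ex_ranges(const struct ex_range_entry *tbl, size_t n, uintptr_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < tbl[mid].start)
			hi = mid;		/* addr lies before this range */
		else if (addr >= tbl[mid].end)
			lo = mid + 1;		/* addr lies after this range */
		else
			return &tbl[mid];	/* start <= addr < end */
	}
	return NULL;
}
```

With a layout like this, the fixup handler only learns which region
faulted, not which instruction, so it would have to work out the
progress some other way, for instance by restarting from a known point
and copying some bytes twice.
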
> For a page-fault during a cacheline-aligned copy, we should be able to
> handle the exception and retry from the last cacheline without much
> logic, again with good performance.
>
> With that said, I'm curious about the origin of the PPC32 ASM. In
> particular, it looks like it was generated by GCC at some point in the
> distant past, and I'm wondering if there's a good way to rewrite that
> file in C and trick GCC into generating the relevant exception tables
> for it?

Why do you think it was generated by gcc? I wrote the original
version, but I think it got extended and macro-ized by others.

Paul.