Message-ID: <4A0007C2-1C2C-4162-98E8-ACCA4E673AFE@boeing.com>
Date: Mon, 14 Nov 2011 20:36:48 -0600
From: "Moffett, Kyle D" <Kyle.D.Moffett@...ing.com>
To: Benjamin Herrenschmidt <benh@...nel.crashing.org>
CC: "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
Kumar Gala <galak@...nel.crashing.org>,
Scott Wood <scottwood@...escale.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Timur Tabi <B04825@...escale.com>,
Paul Gortmaker <paul.gortmaker@...driver.com>
Subject: Re: [RFC PATCH 00/17] powerpc/e500: separate e500 from e500mc
On Nov 10, 2011, at 23:40, Benjamin Herrenschmidt wrote:
> On Thu, 2011-11-10 at 18:38 -0600, Moffett, Kyle D wrote:
>> (2) Make the ppc64_caches struct apply to ppc32 as well, and
>> preinitialize it with a minimum value used by any platform being
>> compiled in (for "dcbXX"/"icbXX" purposes). This is safe because
>> the pagesize is always a multiple of the cache block size and the
>> kernel only uses dcbXX/icbXX on whole pages. The only impact is a
>> temporary small performance hit from flushing or zeroing the same
>> block 8 times if too small.
>
> Are you sure about dcbz ? Getting that wrong can be deadly ... I'd
> rather get rid of some fancy optims and use a soft value in some cases.
> That or we can compile multiple variants for the common case of some of
> the copy routines and use patching (alternate sections) to branch to the
> right one at runtime, at least for the common cases (32 and 128 for
> example for 440 and 476).
Well, all of the kernel loops that use dcbz are operating on whole pages,
and the PPC Book-E spec documents that the pagesize is an even multiple
of the cacheline size and the cachelines are always page-aligned.
So when you are clearing a whole page, there are only 2 things you can do
wrong with "dcbz":
(1) Call "dcbz" with an address outside of the page you want to zero.
(2) Omit calls to "dcbz" for some physical cachelines in the page.
Now, that's a totally different story from the userspace memset() calls
that caused the problem originally, because they were frequently given
memory much smaller than a page to clear, and if you didn't know exactly
how many bytes a "dcbz" was going to clear you couldn't use it at all.
But the kernel doesn't do that anywhere; it only uses "dcbz" for whole-page
clears.
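
To make the argument concrete, here is a rough userspace simulation rather than real kernel code. It assumes a 4096-byte page, and every name in it (sim_dcbz, clear_page_sim, page_clear_is_safe) is made up for illustration. sim_dcbz only mimics the one property of "dcbz" that matters here: it zeroes the entire hardware cache block containing the given address, whatever block size the caller believes is in use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Hypothetical stand-in for "dcbz": zero the whole hardware cache block
 * that contains offset `off`. `hw_block` is the real block size, which
 * the caller of the clear loop may not know. */
static void sim_dcbz(unsigned char *mem, size_t off, size_t hw_block)
{
    size_t start = off - (off % hw_block);
    memset(mem + start, 0, hw_block);
}

/* Clear one page-aligned page, stepping by the block size the kernel
 * *assumes* (e.g. the minimum across all compiled-in platforms). */
static void clear_page_sim(unsigned char *mem, size_t page_off,
                           size_t assumed_block, size_t hw_block)
{
    assert(page_off % PAGE_SIZE == 0);
    assert(PAGE_SIZE % assumed_block == 0);
    for (size_t off = 0; off < PAGE_SIZE; off += assumed_block)
        sim_dcbz(mem, page_off + off, hw_block);
}

/* Clear the middle page of a three-page buffer and report whether exactly
 * that page was zeroed: returns 1 if the middle page is all zero and both
 * neighbouring pages are untouched, 0 otherwise. */
static int page_clear_is_safe(size_t assumed_block, size_t hw_block)
{
    static unsigned char mem[3 * PAGE_SIZE];
    memset(mem, 0xAA, sizeof(mem));
    clear_page_sim(mem, PAGE_SIZE, assumed_block, hw_block);
    for (size_t i = 0; i < sizeof(mem); i++) {
        int in_page = (i >= PAGE_SIZE && i < 2 * PAGE_SIZE);
        if (mem[i] != (in_page ? 0x00 : 0xAA))
            return 0;
    }
    return 1;
}
```

With an assumed block size of 16 and a real one of 128, each block is simply zeroed 8 times and the result is still correct. Only an assumed size larger than the real one (say 128 assumed, 32 real) skips cachelines, which is mistake (2) above.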
Cheers,
Kyle Moffett
--
Curious about my work on the Debian powerpcspe port?
I'm keeping a blog here: http://pureperl.blogspot.com/