Date:	Mon, 14 Nov 2011 20:36:48 -0600
From:	"Moffett, Kyle D" <Kyle.D.Moffett@...ing.com>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
CC:	"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
	Kumar Gala <galak@...nel.crashing.org>,
	Scott Wood <scottwood@...escale.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Timur Tabi <B04825@...escale.com>,
	Paul Gortmaker <paul.gortmaker@...driver.com>
Subject: Re: [RFC PATCH 00/17] powerpc/e500: separate e500 from e500mc

On Nov 10, 2011, at 23:40, Benjamin Herrenschmidt wrote:
> On Thu, 2011-11-10 at 18:38 -0600, Moffett, Kyle D wrote:
>>  (2) Make the ppc64_caches struct apply to ppc32 as well, and
>>      preinitialize it with a minimum value used by any platform being
>>      compiled in (for "dcbXX"/"icbXX" purposes).  This is safe because
>>      the pagesize is always a multiple of the cache block size and the
>>      kernel only uses dcbXX/icbXX on whole pages.  The only impact is a
>>      temporary small performance hit from flushing or zeroing the same
>>      block 8 times if too small.
> 
> Are you sure about dcbz ? Getting that wrong can be deadly ... I'd
> rather get rid of some fancy optims and use a soft value in some cases.
> That or we can compile multiple variants for the common case of some of
> the copy routines and use patching (alternate sections) to branch to the
> right one at runtime, at least for the common cases (32 and 128 for
> example for 440 and 476).

Well, all of the kernel loops that use dcbz are operating on whole pages,
and the PPC Book-E spec documents that the pagesize is an even multiple
of the cacheline size and the cachelines are always page-aligned.

So when you are clearing a whole page, there are only 2 things you can do
wrong with "dcbz":

  (1) Call "dcbz" with an address outside of the page you want to zero.

  (2) Omit calls to "dcbz" for some physical cachelines in the page.
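
To make those two failure modes concrete, here is a rough userspace
simulation (a sketch only, not kernel code).  sim_dcbz() is a made-up
stand-in for the real instruction, and PAGE_SIZE, the 0xAA fill pattern
and the three-page buffer are all assumptions picked for illustration.
It clears the middle page stepping by an assumed block size and then
checks that the page is fully zeroed and its neighbours are untouched:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u  /* assumed page size, illustration only */

/* Hypothetical stand-in for dcbz: zero the whole simulated hardware
 * cache block that contains offset 'off' within 'mem'. */
static void sim_dcbz(uint8_t *mem, size_t off, size_t hw_block)
{
    size_t start = off & ~(hw_block - 1);  /* round down to block start */
    memset(mem + start, 0, hw_block);
}

/* Clear the middle page of a three-page buffer, stepping by 'step'
 * (the block size the loop assumes) while the simulated hardware uses
 * 'hw_block'.  Returns 1 if the page is fully zeroed and both
 * neighbouring pages are untouched, 0 otherwise. */
static int page_clear_is_safe(size_t step, size_t hw_block)
{
    static uint8_t mem[3 * PAGE_SIZE];
    size_t i;

    /* hw_block must be a power of two that divides the page size */
    assert(step != 0 && hw_block != 0);
    assert((hw_block & (hw_block - 1)) == 0);
    assert(PAGE_SIZE % hw_block == 0);

    memset(mem, 0xAA, sizeof(mem));
    for (i = 0; i < PAGE_SIZE; i += step)
        sim_dcbz(mem, PAGE_SIZE + i, hw_block);

    for (i = 0; i < sizeof(mem); i++) {
        int in_page = (i >= PAGE_SIZE && i < 2 * PAGE_SIZE);
        if (mem[i] != (in_page ? 0x00 : 0xAA))
            return 0;
    }
    return 1;
}
```

With a step smaller than the simulated block size, for example
page_clear_is_safe(32, 128), each block just gets zeroed several times
and the function returns 1.  With a step that is too large, for example
page_clear_is_safe(128, 32), some lines are skipped (case 2 above) and
it returns 0.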

Now, that's a totally different story from the userspace memset() calls
that caused the problem originally, because they were frequently given
memory much smaller than a page to clear, and if you didn't know exactly
how many bytes a "dcbz" was going to clear you couldn't use it at all.

But the kernel doesn't do that anywhere; it only uses "dcbz" for page clears.
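
For contrast, here is the sub-page case sketched the same way.  Again
this is only a simulation: fake_dcbz() is a made-up stand-in for the
instruction, and the region size, buffer offset and buffer length are
arbitrary values chosen for illustration.  It zeroes a 64-byte buffer
using an assumed block size and counts how many bytes outside the
buffer got wiped:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for dcbz: zero the whole simulated hardware
 * cache block that contains offset 'off' within 'mem'. */
static void fake_dcbz(uint8_t *mem, size_t off, size_t hw_block)
{
    size_t start = off & ~(hw_block - 1);  /* round down to block start */
    memset(mem + start, 0, hw_block);
}

/* Zero a small buffer inside a larger region the way a memset() that
 * assumes 'assumed_block'-byte cache blocks would, while the simulated
 * hardware really uses 'hw_block'.  Returns the number of bytes
 * OUTSIDE the buffer that were zeroed by mistake. */
static size_t bytes_clobbered(size_t assumed_block, size_t hw_block)
{
    enum { REGION = 512, BUF_OFF = 192, BUF_LEN = 64 };  /* illustrative */
    static uint8_t mem[REGION];
    size_t i, clobbered = 0;

    assert(assumed_block != 0 && hw_block != 0);
    assert((hw_block & (hw_block - 1)) == 0);
    assert(REGION % hw_block == 0);
    assert(BUF_OFF % assumed_block == 0 && BUF_LEN % assumed_block == 0);

    memset(mem, 0xAA, sizeof(mem));
    for (i = 0; i < BUF_LEN; i += assumed_block)
        fake_dcbz(mem, BUF_OFF + i, hw_block);

    for (i = 0; i < REGION; i++) {
        int in_buf = (i >= BUF_OFF && i < BUF_OFF + BUF_LEN);
        if (!in_buf && mem[i] != 0xAA)
            clobbered++;
    }
    return clobbered;
}
```

When the assumption matches the hardware, bytes_clobbered(32, 32)
returns 0.  When the hardware block is really 128 bytes,
bytes_clobbered(32, 128) returns 64: the 64 bytes just below the buffer
share its cache block and get zeroed along with it.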

Cheers,
Kyle Moffett

--
Curious about my work on the Debian powerpcspe port?
I'm keeping a blog here: http://pureperl.blogspot.com/
