linux-kernel - Re: SLUB defrag pull request?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <200810282219.44022.nickpiggin@yahoo.com.au>
Date:	Tue, 28 Oct 2008 22:19:43 +1100
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Pekka Enberg <penberg@...helsinki.fi>
Cc:	Eric Dumazet <dada1@...mosbay.com>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Miklos Szeredi <miklos@...redi.hu>, hugh@...itas.com,
	linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org
Subject: Re: SLUB defrag pull request?

On Tuesday 28 October 2008 22:06, Pekka Enberg wrote:
> On Thu, 2008-10-23 at 19:14 +0200, Eric Dumazet wrote:
> > [PATCH] slub: slab_alloc() can use prefetchw()
> >
> > Most kmalloced() areas are initialized/written right after allocation.
> >
> > prefetchw() gives a hint to cpu saying this cache line is going to be
> > *modified*, even if first access is a read.
> >
> > Some architectures can save some bus transactions, acquiring
> > the cache line in an exclusive way instead of shared one.
> >
> > Same optimization was done in 2005 on SLAB in commit
> > 34342e863c3143640c031760140d640a06c6a5f8
> > ([PATCH] mm/slab.c: prefetchw the start of new allocated objects)
> >
> > Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
>
> Christoph, I was sort of expecting a NAK/ACK from you before merging
> this. I would be nice to have numbers on this but then again I don't see
> how this can hurt either.

I've seen explicit prefetches hurt quite surprising amount if they're
not placed in appropriate places (which includes putting them in
places where the object is already in cache, or the processor is in a
good position to have speculatively initiated the operation anyway).

I'm not saying it's going to be the case here, but it can be really
hard to actually tell if it is worthwhile, IMO. For example, some
nice CPU local workloads that are often fitting within cache, might
have the object already in cache 99.x% of the time here. prefetch may
easily slow things down.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/