[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <p73vecx21um.fsf@bingen.suse.de>
Date: 06 Jul 2007 13:44:33 +0200
From: Andi Kleen <andi@...stfloor.org>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
Alexey Dobriyan <adobriyan@...il.com>,
linux-kernel@...r.kernel.org
Subject: Re: [patch 10/10] Scheduler profiling - Use immediate values
Andrew Morton <akpm@...ux-foundation.org> writes:
> Is that 48 cycles measured when the target of the read is in L1 cache, as
> it would be in any situation which we actually care about? I guess so...
The normal situation is big database or other bloated software runs; clears
all the dcaches, then enters kernel. Kernel has a cache miss on all its data.
But icache access is faster because the CPU prefetches.
We've had cases like this -- e.g. the additional dcache line
accesses that were added by the new time code in vgettimeofday()
were visible in macro benchmarks.
Also cache misses in this situation tend to be much more than 48 cycles
(even an K8 with integrated memory controller with fastest DIMMs is
slower than that) Mathieu probably measured an L2 miss, not a load from RAM.
Load from RAM can be hundreds of ns in the worst case.
I think the optimization is a good idea, although i dislike it
that it is complicated for the dynamic markers. If it was just
static it would be much simpler.
-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists