[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20070530095007.GA24087@c3po.0xdef.net>
Date: Wed, 30 May 2007 11:50:07 +0200
From: Hagen Paul Pfeifer <hagen@...u.net>
To: linux-kernel@...r.kernel.org
Cc: Ingo Molnar <mingo@...e.hu>,
Hagen Paul Pfeifer <pfeifer@...lab.nec.de>
Subject: [PATCH] INTERNODE_CACHE_SHIFT redefinition for hot spot structures
I followed code in include/linux/mmzone.h and attract attention to the padding
construct for struct zone:
struct zone_padding {
char x[0];
} ____cacheline_internodealigned_in_smp;
The definition for ____cacheline_internodealigned_in_smp
is definded as __attribute__((__aligned__(1 << (INTERNODE_CACHE_SHIFT))))
INTERNODE_CACHE_SHIFT in turn is defined as L1_CACHE_SHIFT for all
architectures (expect x86_64) as L1_CACHE_SHIFT - which is not that what
everybody should expected if they use ____cacheline_internodealigned_in_smp!
The comment stated out: "The maximum alignment needed for some critical
structures. These could be inter-node cacheline sizes/L3 cacheline size etc.
Define this in asm/cache.h for your arch"
If a architecture redefine INTERNODE_CACHE_SHIFT (as x86_64) this patch has no
effects at all. All other architecture will profit from a better cache line
utilization.
The superior solution is to utilize the gcc 4.2 alternative to use the cpuid
instruction (which is new in gcc release 4.2) to gather CPU attributes at
compile time and replace the static
#define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)
macros for at least amd/intel processors- but that is another thought ...
(Which _user_ knows the l1/l2 cache line size of the current CPU if they are
different for different models (e.g P4)? If this fails -> fall back to standard
size). But as I said - this is a idea.
See gcc/config/i386/driver-i386.c:host_detect_local_cpu() for more information
in the gcc devel branch.
Salaam
Hagen Paul Pfeifer
Increase (double) the definition for INTERNODE_CACHE_SHIFT to accommodate a
higher cache line size for l2 and l3 cacheline sizes - compared to l1 cache
line size. The introduced overhead for architectures where the l1 cache line
size correspond to the l2 cache line size is evanescent small, cause
INTERNODE_CACHE_SHIFT is used only for hot spots.
Signed-off-by: Hagen Paul Pfeifer <hagen@...u.net>
---
include/linux/cache.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
View attachment "46130e6afdad1c3df698bde8b86ab89568a3f6d9.diff" of type "text/x-patch" (417 bytes)
Powered by blists - more mailing lists