Message-ID: <20090112223421.GA20594@localdomain>
Date:	Mon, 12 Jan 2009 14:34:21 -0800
From:	Ravikiran G Thirumalai <kiran@...lex86.org>
To:	Frederik Deweerdt <frederik.deweerdt@...og.eu>
Cc:	mingo@...e.hu, andi@...stfloor.org, tglx@...utronix.de,
	hpa@...or.com, linux-kernel@...r.kernel.org
Subject: Re: [patch] tlb flush_data: replace per_cpu with an array

Hi Frederik,

On Mon, Jan 12, 2009 at 10:35:42PM +0100, Frederik Deweerdt wrote:
>
>Signed-off-by: Frederik Deweerdt <frederik.deweerdt@...og.eu>
>
>diff --git a/arch/x86/kernel/tlb_64.c b/arch/x86/kernel/tlb_64.c
>index f8be6f1..c177a1f 100644
>--- a/arch/x86/kernel/tlb_64.c
>+++ b/arch/x86/kernel/tlb_64.c
>@@ -33,7 +33,7 @@
>  *	To avoid global state use 8 different call vectors.
>  *	Each CPU uses a specific vector to trigger flushes on other
>  *	CPUs. Depending on the received vector the target CPUs look into
>- *	the right per cpu variable for the flush data.
>+ *	the right array slot for the flush data.
>  *
>  *	With more than 8 CPUs they are hashed to the 8 available
>  *	vectors. The limited global vector space forces us to this right now.
>@@ -54,7 +54,7 @@ union smp_flush_state {
> /* State is put into the per CPU data section, but padded
>    to a full cache line because other CPUs can access it and we don't
>    want false sharing in the per cpu data segment. */
>-static DEFINE_PER_CPU(union smp_flush_state, flush_state);
>+static union smp_flush_state flush_state[NUM_INVALIDATE_TLB_VECTORS];

Since flush_state contains a spinlock, an array of flush_state elements
could end up with multiple spinlocks on the same L1 cacheline.  However,
I see that union smp_flush_state is already padded to SMP_CACHE_BYTES,
so that much is taken care of.
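For reference, the union as it stands looks roughly like this
(paraphrased from arch/x86/kernel/tlb_64.c; the exact field list may
differ):

	union smp_flush_state {
		struct {
			cpumask_t flush_cpumask;
			struct mm_struct *flush_mm;
			unsigned long flush_va;
			spinlock_t tlbstate_lock;
		};
		char pad[SMP_CACHE_BYTES];	/* pad to a full L1 cacheline */
	} ____cacheline_aligned;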

With per-cpu areas, the locks belonging to different CPUs did not end up
on the same internode cacheline, because per-cpu areas are allocated per
node.  With a simple array, however, two locks could end up on the same
internode cacheline.

Hence, can you please change the padding in smp_flush_state to
INTERNODE_CACHE_BYTES (you would have to derive this from
INTERNODE_CACHE_SHIFT, since only the shift is defined) and align it
using ____cacheline_internodealigned_in_smp?
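Something along these lines (an untested sketch only; the
INTERNODE_CACHE_BYTES name is one I'm introducing here, derived from
INTERNODE_CACHE_SHIFT in include/linux/cache.h, and the field layout is
paraphrased from tlb_64.c):

	#include <linux/cache.h>

	/* only the shift is provided by linux/cache.h */
	#define INTERNODE_CACHE_BYTES	(1 << INTERNODE_CACHE_SHIFT)

	union smp_flush_state {
		struct {
			cpumask_t flush_cpumask;
			struct mm_struct *flush_mm;
			unsigned long flush_va;
			spinlock_t tlbstate_lock;
		};
		/* pad to a full internode cacheline so the locks of
		   different array slots never share one */
		char pad[INTERNODE_CACHE_BYTES];
	} ____cacheline_internodealigned_in_smp;

	static union smp_flush_state flush_state[NUM_INVALIDATE_TLB_VECTORS];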