Date:	Thu, 20 Aug 2009 17:41:23 +0900
From:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	Anton Blanchard <anton@...ba.org>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>, balbir@...ux.vnet.ibm.com,
	Bharata B Rao <bharata@...ux.vnet.ibm.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Ingo Molnar <mingo@...e.hu>, mingo@...hat.com, hpa@...or.com,
	linux-kernel@...r.kernel.org, schwidefsky@...ibm.com,
	balajirrao@...il.com, dhaval@...ux.vnet.ibm.com,
	tglx@...utronix.de, akpm@...ux-foundation.org
Subject: [PATCH] better align percpu counter (Was Re: [tip:sched/core]
 sched: cpuacct: Use bigger percpu counter batch values for stats counters

On Thu, 20 Aug 2009 16:24:51 +1000
Anton Blanchard <anton@...ba.org> wrote:

>  
> Hi,
> 
> > Could you share the context-switch test program?
> > I'd like to play with it to find out what I can do about the percpu counter.
> 
> Sure:
> 
> http://ozlabs.org/~anton/junkcode/context_switch.c
> 
> Very simple, just run it once per core:
> 
> for i in `seq 0 31`
> do
> 	taskset -c $i ./context_switch &
> done
> 
> Then look at the context switch rates in vmstat.
> 
Thank you for the test program.

Before adjusting the batch value (I think you should modify it),
could you try this?

I only have an 8-CPU (2-socket) host, but it works well there.
(But... my host is x86-64 and does not have virt-cpu-accounting.)

With your program:

before patch
cpuacct off : 414000-416000 context switches per sec.
cpuacct on  : 401000-404000 context switches per sec.

after patch
cpuacct on  : 412000-413000 context switches per sec.

Maybe I should check cache misses later ;)
==
It's bad to place the pointer to the array of per-cpu data on the
same cache line as the spinlock. This patch moves that pointer onto
its own cache line in struct percpu_counter to reduce false sharing.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
---
 include/linux/percpu_counter.h |   12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Index: linux-2.6.31-rc6/include/linux/percpu_counter.h
===================================================================
--- linux-2.6.31-rc6.orig/include/linux/percpu_counter.h	2009-08-20 12:09:27.000000000 +0900
+++ linux-2.6.31-rc6/include/linux/percpu_counter.h	2009-08-20 17:31:13.000000000 +0900
@@ -14,14 +14,24 @@
 #include <linux/types.h>
 
 #ifdef CONFIG_SMP
+struct __percpu_counter_padding {
+	char x[0];
+} ____cacheline_internodealigned_in_smp;
+#define CACHELINE_PADDING(name)  struct __percpu_counter_padding name
 
 struct percpu_counter {
+	/*
+	 * This pointer never changes after init and is read first on every
+	 * access, so it should not be evicted by other CPUs taking the lock.
+	 */
+	s32 *counters;
+	CACHELINE_PADDING(pad);
 	spinlock_t lock;
 	s64 count;
 #ifdef CONFIG_HOTPLUG_CPU
+	/* rarely accessed field */
 	struct list_head list;	/* All percpu_counters are on a list */
 #endif
-	s32 *counters;
 };
 
 extern int percpu_counter_batch;

