linux-kernel - Re: [cpuops cmpxchg V2 4/5] vmstat: User per cpu atomics to avoid interrupt disable / enable

Message-ID: <4D08F0A2.9010301@kernel.org>
Date:	Wed, 15 Dec 2010 17:45:22 +0100
From:	Tejun Heo <tj@...nel.org>
To:	Christoph Lameter <cl@...ux.com>
CC:	akpm@...ux-foundation.org, Pekka Enberg <penberg@...helsinki.fi>,
	linux-kernel@...r.kernel.org,
	Eric Dumazet <eric.dumazet@...il.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Subject: Re: [cpuops cmpxchg V2 4/5] vmstat: User per cpu atomics to avoid
 interrupt disable / enable

On 12/14/2010 05:28 PM, Christoph Lameter wrote:
> Currently the operations to increment vm counters must disable interrupts
> in order to not mess up their housekeeping of counters.
> 
> So use this_cpu_cmpxchg() to avoid the overhead. Since we can no longer
> count on preemption being disabled we still have some minor issues.
> The fetching of the counter thresholds is racy.
> A threshold from another cpu may be applied if we happen to be
> rescheduled on another cpu.  However, the following vmstat operation
> will then bring the counter again under the threshold limit.
> 
> The operations for __xxx_zone_state are not changed since the caller
> has taken care of the synchronization needs (and therefore the cycle
> count is even less than the optimized version for the irq disable case
> provided here).
> 
> The optimization using this_cpu_cmpxchg will only be used if the arch
> supports efficient this_cpu_ops (must have CONFIG_CMPXCHG_LOCAL set!)
> 
> The use of this_cpu_cmpxchg reduces the cycle count for the counter
> operations by 80% (inc_zone_page_state goes from 170 cycles to 32).
> 
> Signed-off-by: Christoph Lameter <cl@...ux.com>
>
+/*
+ * If we have cmpxchg_local support then we do not need to incur the overhead
+ * that comes with local_irq_save/restore if we use this_cpu_cmpxchg.
+ *
+ * mod_state() modifies the zone counter state through atomic per cpu
+ * operations.
+ *
+ * Overstep mode specifies how overstep should be handled:
+ *     0       No overstepping
+ *     1       Overstepping half of threshold
+ *     -1      Overstepping minus half of threshold
+ */
+static inline void mod_state(struct zone *zone,
+       enum zone_stat_item item, int delta, int overstep_mode)
+{
+	struct per_cpu_pageset __percpu *pcp = zone->pageset;
+	s8 __percpu *p = pcp->vm_stat_diff + item;
+	long o, n, t, z;
+
+	do {
+		z = 0;  /* overflow to zone counters */
+
+		/*
+		 * The fetching of the stat_threshold is racy. We may apply
+		 * a counter threshold to the wrong cpu if we get
+		 * rescheduled while executing here. However, the following
+		 * will apply the threshold again and therefore bring the
+		 * counter under the threshold.
+		 */

What does "the following" mean here?  Later executions of the
function?  It seems like the counter can go out of the threshold at
least temporarily, which probably is okay but I think the comment can
be improved a bit.
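
To make the question concrete, here is a rough userspace model of the
loop, not the patch's code. Everything in it is an assumption for
illustration: struct model_counter and model_mod_state() are hypothetical
names, C11 atomics stand in for this_cpu_cmpxchg(), and the fold rule
(push the excess plus the overstep into the zone counter) is inferred
from the overstep description in the comment above.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins; these are not kernel structures. */
struct model_counter {
	_Atomic signed char diff;	/* models the per cpu vm_stat_diff entry */
	_Atomic long zone_total;	/* models the global zone counter */
	int threshold;			/* models the per cpu stat_threshold */
};

/*
 * Apply delta to the small differential. If the result leaves
 * [-threshold, threshold], fold it into the zone counter, leaving
 * -overstep_mode * (threshold / 2) behind in the differential.
 */
static void model_mod_state(struct model_counter *c, int delta,
			    int overstep_mode)
{
	signed char o;
	long n, t, z;

	assert(c->threshold > 0 && c->threshold <= 125);

	o = atomic_load(&c->diff);
	do {
		z = 0;			/* overflow to zone counter */
		t = c->threshold;	/* in the kernel this read is the racy part */
		n = delta + o;

		if (n > t || n < -t) {
			long os = overstep_mode * (t >> 1);

			z = n + os;	/* amount moved to the zone counter */
			n = -os;	/* remainder kept in the differential */
		}
		/* On failure, o is reloaded with the current value and we retry. */
	} while (!atomic_compare_exchange_weak(&c->diff, &o, (signed char)n));

	if (z)
		atomic_fetch_add(&c->zone_total, z);
}
```

In this model, a differential that was left above the local threshold
(say, because a larger threshold from another cpu had been applied) is
only folded back by the next call that touches the same counter. That
matches the "later executions of the function" reading, with the
counter out of bounds in between.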

Thanks.

-- 
tejun