linux-kernel - Re: [patch v3 2/3] percpu_counter: use atomic64 for counter in SMP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Date:	Tue, 17 May 2011 11:02:32 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Shaohua Li <shaohua.li@...el.com>
Cc:	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	tj@...nel.org, cl@...ux.com
Subject: Re: [patch v3 2/3] percpu_counter: use atomic64 for counter in SMP

Le mardi 17 mai 2011 à 16:41 +0800, Shaohua Li a écrit :
> pièce jointe document texte brut (percpu-counter-atomic.patch)
> Uses atomic64 for percpu_counter, because it is cheaper than spinlock.
> This doesn't slow fast path (percpu_counter_read). atomic64_read
> equals to read fbc->count for 64-bit system, or equals to
> spin_lock-read-spin_unlock for 32-bit system. Note, originally
> the percpu_counter_read for 32-bit system doesn't hold spin_lock,
> but that is buggy and might cause very wrong value accessed. This
> patch fixes the issue.
> 
> We use sum_start and add_start to make sure _sum doesn't see deviation
> when _add slow path is running. When _sum is running, _add will wait
> _sum finish. This is scaring that _add is slow down, but actually not,
> because _sum is called very rare. We could make _sum waits _add finish,
> but since _add is called frequently, this will make _sum run very slow.
> 
> This can also improve some workloads with percpu_counter->lock heavily
> contented. For example, vm_committed_as sometimes causes the contention.
> We should tune the batch count, but if we can make percpu_counter better,
> why not? In a 24 CPUs system and 24 processes, each runs:
> while (1) {
> 	mmap(128M);
> 	munmap(128M);
> }
> we then measure how many loops each process can take:
> The atomic method gives 4x faster.
> 
> Signed-off-by: Shaohua Li <shaohua.li@...el.com>
> --

I NACK this patch, its not necessary, since percpu_counter doesnt
provide a precise count api anyway.

Please resubmit your original patches, without the bloat.

Thanks


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/