Message-ID: <1305622952.2850.23.camel@edumazet-laptop>
Date:	Tue, 17 May 2011 11:02:32 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Shaohua Li <shaohua.li@...el.com>
Cc:	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	tj@...nel.org, cl@...ux.com
Subject: Re: [patch v3 2/3] percpu_counter: use atomic64 for counter in SMP

On Tuesday, 17 May 2011 at 16:41 +0800, Shaohua Li wrote:
> plain-text attachment (percpu-counter-atomic.patch)
> Use atomic64 for the percpu_counter count, because it is cheaper than a
> spinlock. This does not slow down the fast path (percpu_counter_read):
> atomic64_read is equivalent to reading fbc->count on 64-bit systems, and
> to spin_lock-read-spin_unlock on 32-bit systems. Note that originally
> percpu_counter_read on 32-bit systems did not hold the spinlock, but
> that was buggy and could return a badly wrong value. This patch fixes
> that issue.
> 
> We use sum_start and add_start to make sure _sum does not see a
> deviation while the _add slow path is running. While _sum is running,
> _add waits for it to finish. This may sound as though _add is slowed
> down, but in practice it is not, because _sum is called very rarely. We
> could instead make _sum wait for _add to finish, but since _add is
> called frequently, that would make _sum very slow.
> 
> This can also improve workloads in which percpu_counter->lock is heavily
> contended. For example, vm_committed_as sometimes causes the contention.
> We should tune the batch count, but if we can make percpu_counter better,
> why not? On a 24-CPU system with 24 processes, each running:
> while (1) {
> 	mmap(128M);
> 	munmap(128M);
> }
> we measure how many loop iterations each process completes:
> the atomic method is about 4x faster.
> 
> Signed-off-by: Shaohua Li <shaohua.li@...el.com>
> --

I NACK this patch; it's not necessary, since percpu_counter doesn't
provide a precise count API anyway.

Please resubmit your original patches, without the bloat.

Thanks


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
