Date:	Tue, 03 Feb 2015 10:34:05 +0100
From:	Rasmus Villemoes <linux@...musvillemoes.dk>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	"Wang\, Yalin" <Yalin.Wang@...ymobile.com>,
	"'Kirill A. Shutemov'" <kirill@...temov.name>,
	"'arnd\@arndb.de'" <arnd@...db.de>,
	"'linux-arch\@vger.kernel.org'" <linux-arch@...r.kernel.org>,
	"'linux-kernel\@vger.kernel.org'" <linux-kernel@...r.kernel.org>,
	"'linux\@arm.linux.org.uk'" <linux@....linux.org.uk>,
	"'linux-arm-kernel\@lists.infradead.org'" 
	<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [RFC] change non-atomic bitops method

On Tue, Feb 03 2015, Andrew Morton <akpm@...ux-foundation.org> wrote:

>
> You aren't measuring the right thing.  You should compare
>
> 	if (p[i] != x)
> 		p[i] = x;
>
> versus
>
> 	p[i] = x;
>
> and you should do this for two cases:
>
> a) p[i] == x
>
> b) p[i] != x
>
>
> The first code sequence will be slower when (p[i] != x) and faster when
> (p[i] == x).
>
>
> Next, we should instrument the kernel to work out the frequency of
> set_bit on an already-set bit.
>
> It is only with both these ratios that we can work out whether the
> patch is a net gain.  My suspicion is that set_bit on an already-set
> bit is so rare that the patch will be a loss.
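To make the "both ratios" point concrete, here is a rough cost-model sketch in C. Everything in it is an assumption for illustration: the function names, the idea of boiling each case down to a single average cost, and the parameter names. The inputs would have to come from real measurements.

```c
#include <assert.h>

/*
 * Hypothetical cost model (arbitrary units, e.g. cycles):
 *   cost_plain          - unconditional store
 *   cost_skip           - conditional version, bit already set (no store)
 *   cost_test_and_store - conditional version, bit not set (test + store)
 *   f_already_set       - fraction of calls that hit an already-set bit
 */
static inline double cond_expected_cost(double f_already_set,
					double cost_skip,
					double cost_test_and_store)
{
	return f_already_set * cost_skip +
	       (1.0 - f_already_set) * cost_test_and_store;
}

/*
 * Fraction of already-set hits at which the conditional version breaks
 * even with the plain store. Above this fraction the conditional wins
 * (under this model only).
 */
static inline double break_even_fraction(double cost_plain,
					  double cost_skip,
					  double cost_test_and_store)
{
	/* The model only makes sense if skipping is cheaper than storing. */
	assert(cost_test_and_store > cost_skip);
	return (cost_test_and_store - cost_plain) /
	       (cost_test_and_store - cost_skip);
}
```

The model deliberately ignores anything that a per-call cost cannot capture, so it is at best a first filter.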

There's also the code-bloat issue to consider (instruction cache and all
that); the conditional versions will usually require three extra
instructions and an extra register. Also, the cache line might already
be dirty because of something in the surrounding code. Instruction cache
misses and a larger stack footprint (from higher register pressure) won't
show up in a microbenchmark, so I think this needs a real-world example
to justify it.
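
For reference, the two shapes being compared look roughly like this. This is a simplified user-space sketch, not the kernel's actual bitops code; the names set_bit_plain, set_bit_cond and check_variants_agree are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Unconditional variant: read-modify-write, always performs the store. */
static inline void set_bit_plain(unsigned int nr, unsigned long *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG);
	unsigned long *p = addr + nr / BITS_PER_LONG;

	*p |= mask;
}

/*
 * Conditional variant: test first, store only if the bit is clear.
 * The extra test and branch are where the additional instructions
 * come from.
 */
static inline void set_bit_cond(unsigned int nr, unsigned long *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG);
	unsigned long *p = addr + nr / BITS_PER_LONG;

	if (!(*p & mask))
		*p |= mask;
}

/* Both variants must leave the bitmap in the same state. */
void check_variants_agree(void)
{
	unsigned long a[2] = { 0, 0 }, b[2] = { 0, 0 };
	unsigned int nr = (unsigned int)BITS_PER_LONG + 1;

	set_bit_plain(nr, a);
	set_bit_cond(nr, b);
	set_bit_cond(nr, b);	/* second call takes the no-store path */
	assert(a[0] == b[0] && a[1] == b[1]);
}
```

Functionally the two are identical; the only difference is whether the store happens when the bit is already set.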

But even if one finds some hot spot that would benefit from the
conditional, that should simply be added explicitly there, instead of
pessimizing every other user. (A good example of that is 358eec18243a
("vfs: decrapify dput(), fix cache behavior under normal load")).
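
What I mean by adding it explicitly is something like the following, at the one call site where it matters. This is only a sketch of the pattern with a hypothetical struct and flag, not the code from that commit.

```c
#include <assert.h>

#define OBJ_REFERENCED 0x1u	/* hypothetical flag bit */

struct obj {
	unsigned int flags;	/* hypothetical, stands in for a hot shared field */
};

/*
 * Hot-path helper: skip the store when the flag is already set, so a
 * shared cache line is not dirtied needlessly. The return value (1 if
 * a store was done, 0 otherwise) exists only to make the behaviour
 * observable in this sketch.
 */
static inline int obj_mark_referenced(struct obj *o)
{
	if (o->flags & OBJ_REFERENCED)
		return 0;
	o->flags |= OBJ_REFERENCED;
	return 1;
}

/* Usage: only the first call on a fresh object writes to it. */
void obj_mark_referenced_demo(void)
{
	struct obj o = { 0 };

	assert(obj_mark_referenced(&o) == 1);
	assert(obj_mark_referenced(&o) == 0);
	assert(o.flags == OBJ_REFERENCED);
}
```

The generic helper stays unconditional, and only the caller known to benefit pays for the extra test.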

Rasmus