Date:	Wed, 3 Feb 2010 19:14:25 +0100
From:	Borislav Petkov <bp@...64.org>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Wu Fengguang <fengguang.wu@...el.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Jamie Lokier <jamie@...reable.org>,
	Roland Dreier <rdreier@...co.com>,
	Al Viro <viro@...IV.linux.org.uk>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH 2/5] bitops: compile time optimization for
 hweight_long(CONSTANT)

On Wed, Feb 03, 2010 at 07:42:51AM -0800, Andrew Morton wrote:
> We didn't deal with it on every architecture, which is something which
> the compiler extension takes care of.
> 
> In fact I can't find anywhere where we dealt with it on x86.

Yeah, we talked briefly about using hardware popcnt, see thread
beginning at

http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-06/msg00245.html

for example. I did an ftrace of the cpumask_weight() calls in sched.c to
see whether there would be a measurable performance gain, but there
didn't seem to be one at the time. My numbers showed roughly 170 hweight
calls per second, and since the <lib/hweight.c> implementations
translate to roughly 20 insns (about 30 for hweight64), the whole thing
wasn't worth the trouble: you'd either have to check binutils versions
and slap in raw opcodes, or use gcc intrinsics, which involves gcc
version checking.
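
For reference, a software hweight of the kind discussed above can be
sketched as follows. This is a minimal userspace illustration of the
usual parallel bit-summation approach, not a copy of <lib/hweight.c>;
the names sw_hweight32() and sw_hweight64() are made up for this
example.

```c
#include <stdint.h>

/*
 * Branch-free population count of a 32-bit word.
 * Hypothetical helper for illustration; not the kernel's code.
 */
static unsigned int sw_hweight32(uint32_t w)
{
	/* Sum adjacent bits into 2-bit fields. */
	w -= (w >> 1) & 0x55555555u;
	/* Sum adjacent 2-bit fields into 4-bit fields. */
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u);
	/* Sum adjacent 4-bit fields into bytes. */
	w = (w + (w >> 4)) & 0x0f0f0f0fu;
	/* Add the four bytes together; the total lands in the top byte. */
	return (w * 0x01010101u) >> 24;
}

/* 64-bit variant built from two 32-bit halves. */
static unsigned int sw_hweight64(uint64_t w)
{
	return sw_hweight32((uint32_t)w) + sw_hweight32((uint32_t)(w >> 32));
}
```

A handful of shifts, masks and adds per call is consistent with the
instruction counts estimated above, although the exact number depends
on the compiler and the architecture.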

An alternatives-based solution keyed on the CPUID flag could add the
popcnt opcode without checking any toolchain versions, but what is the
replacement instruction going to look like? Something like

alternative("call hweightXX", "popcnt", X86_FEATURE_POPCNT)

by making sure the arg is in some register first?

Hmm..
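
To make the CPUID-keyed idea above concrete, here is a rough userspace
analogue. It is only a sketch under assumptions: it picks an
implementation once through a function pointer instead of patching
instructions the way alternative() does, it assumes a gcc-compatible
compiler on x86 (with a generic fallback elsewhere), and it still needs
an assembler that knows the popcnt mnemonic. All function names are
hypothetical. The "r" input constraint is what forces the argument into
a register before popcnt executes.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#endif

/* Generic fallback: parallel bit summation, hypothetical helper. */
static unsigned int sw_popcount32(uint32_t w)
{
	w -= (w >> 1) & 0x55555555u;
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u);
	w = (w + (w >> 4)) & 0x0f0f0f0fu;
	return (w * 0x01010101u) >> 24;
}

#ifdef HAVE_X86_CPUID
/* Hardware path: the "r" constraint puts the argument in a register. */
static unsigned int hw_popcount32(uint32_t w)
{
	uint32_t res;

	__asm__("popcnt %1, %0" : "=r" (res) : "r" (w) : "cc");
	return res;
}

/* POPCNT support is reported in CPUID leaf 1, ECX bit 23. */
static int cpu_has_popcnt(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return 0;
	return (ecx >> 23) & 1;
}
#else
/* Non-x86 or non-gcc build: always use the generic version. */
static unsigned int hw_popcount32(uint32_t w)
{
	return sw_popcount32(w);
}

static int cpu_has_popcnt(void)
{
	return 0;
}
#endif

/* Chosen on first use; stands in for boot-time instruction patching. */
static unsigned int (*popcount32_impl)(uint32_t);

static unsigned int popcount32(uint32_t w)
{
	if (!popcount32_impl)
		popcount32_impl = cpu_has_popcnt() ? hw_popcount32
						   : sw_popcount32;
	return popcount32_impl(w);
}
```

The indirect call here costs more than a patched-in instruction would,
so this only shows the selection logic, not the performance
characteristics of a real alternatives-based implementation.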

-- 
Regards/Gruss,
Boris.

--
Advanced Micro Devices, Inc.
Operating Systems Research Center