Message-ID: <20100326173730.GA27489@pendragon.3leafnetworks.com>
Date:	Fri, 26 Mar 2010 10:37:30 -0700
From:	Scott Lurndal <scott.lurndal@...afsystems.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	David Howells <dhowells@...hat.com>, mingo@...e.hu,
	tglx@...utronix.de, linux-arch@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] X86: Optimise fls(), ffs() and fls64()

On Fri, Mar 26, 2010 at 10:23:46AM -0700, Linus Torvalds wrote:
> 
> 
> On Fri, 26 Mar 2010, David Howells wrote:
> >
> > fls(N), ffs(N) and fls64(N) can be optimised on x86/x86_64.  Currently they
> > perform checks against N being 0 before invoking the BSR/BSF instruction, or
> > use a CMOV instruction afterwards.  Either the check involves a conditional
> > jump which we'd like to avoid, or a CMOV, which we'd also quite like to avoid.
> > 
> > Instead, we can make use of the fact that BSR/BSF doesn't modify its output
> > register if its input is 0.  By preloading the output with -1 and incrementing
> > the result, we achieve the desired result without the need for a conditional
> > check.
> 
> This is totally incorrect.
> 
> Where did you find that "doesn't modify its output" thing? It's not true. 
> The truth is that the destination is undefined. Just read the dang Intel 
> documentation, it's very clearly stated right there.

While this is true for the current (253666-031US) Intel documentation,
the AMD documentation (rev 3.14) for the same instruction states that the
destination register is unchanged (as opposed to Intel's undefined).

I wonder if Intel's EM64T makes this more deterministic; perhaps
David's implementation would work for x86_64 only?
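
For reference, here is a rough sketch of the two approaches being discussed,
written as stand-alone user-space C rather than as the kernel's actual code.
The names fls_checked() and fls_preload() are made up for illustration, and
GCC/Clang extensions (__builtin_clz, extended inline asm) are assumed.
The second variant only gives the right answer for 0 if BSR leaves its
destination untouched on a zero source, which is what the AMD manual states
and what the Intel manual leaves undefined.

```c
#include <limits.h>

/*
 * Variant 1 (safe everywhere): test for zero before scanning.
 * __builtin_clz(0) is undefined, much like BSR's result on Intel,
 * so the explicit check is required.
 */
static inline int fls_checked(unsigned int x)
{
    if (x == 0)
        return 0;
    /* Position of the highest set bit, counted from 1. */
    return (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x);
}

#if defined(__x86_64__) || defined(__i386__)
/*
 * Variant 2 (the proposal): preload the output with -1 and let BSR
 * overwrite it only when x != 0.
 * ASSUMPTION: BSR does not modify its destination when the source is 0.
 * That is the documented AMD behaviour; Intel documents it as undefined.
 */
static inline int fls_preload(unsigned int x)
{
    int r = -1;

    /* "+r": r is both an input (the preloaded -1) and the output. */
    __asm__("bsrl %1, %0" : "+r"(r) : "rm"(x) : "cc");
    return r + 1;
}
#endif
```

Both variants agree for any non-zero input; they can only differ for x == 0,
and then only on a CPU that actually writes the destination in that case.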

scott