Message-ID: <87386ayv8o.fsf@rasmusvillemoes.dk>
Date:	Fri, 13 Feb 2015 11:13:43 +0100
From:	Rasmus Villemoes <linux@...musvillemoes.dk>
To:	"George Spelvin" <linux@...izon.com>
Cc:	akpm@...ux-foundation.org, chris@...is-wilson.co.uk,
	davem@...emloft.net, dborkman@...hat.com,
	hannes@...essinduktion.org, klimov.linux@...il.com,
	laijs@...fujitsu.com, linux-kernel@...r.kernel.org,
	msalter@...hat.com, takahiro.akashi@...aro.org, tgraf@...g.ch,
	valentinrothberg@...il.com, yury.norov@...il.com
Subject: Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

On Fri, Feb 13 2015, "George Spelvin" <linux@...izon.com> wrote:

>> the main loop is 20--3b. The test instruction at 2e seems to be
>> redundant. The same at 37: the sub instruction already sets plenty of
>> flags that could be used, so explicitly comparing %rbx to -1 seems
>> redundant.
>
> Er... I think you hand-edited that code; it's wrong.  The loop assumes that
> %rbx is in units of words, but the prologue sets it up in units of bits.

No, I didn't hand-edit the output, but I did mess up the source by hand :-)
My DIV_ROUND_UP macro was bogus. Well spotted. With that fixed, I still see
the redundant cmp and test, though.
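
For reference, a rounding-up division in the form the kernel's
DIV_ROUND_UP takes is sketched below. The helper name bits_to_words() is
made up for this illustration, and 64-bit words are assumed, as in the
x86-64 listing under discussion.

```c
/* Assumption for this sketch: 64-bit words, as on x86-64. */
#define BITS_PER_LONG 64

/* Round-up integer division: smallest q such that q * d >= n. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: number of words needed to hold `size` bits. */
static unsigned long bits_to_words(unsigned long size)
{
	return DIV_ROUND_UP(size, BITS_PER_LONG);
}
```

With this definition a size of 65 bits needs two words, and a loop
counter set up this way is in units of words rather than bits.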

> The mov to %rcx is also redundant, since it could be eliminated with
> some minor rescheduling.
>
> The code generation I *want* for that function is:
>
> # addr in %rdi, size in %rsi
> 	movl	%esi, %ecx
> 	leaq	0x3f(%rsi), %rax
> 	negl	%ecx
> 	movq	$-1, %rdx
> 	shrq	$6, %rax
> 	shrq	%cl, %rdx
> 	jmp	2f
> 1:
> 	movq	$-1, %rdx
> 2:
> 	subq	$1, %rax
> 	jc	3f
> 	andq	(%rdi,%rax,8), %rdx
> 	jeq	1b
>
> 	bsrq	%rdx, %rdx
> 	salq	$6, %rax
> 	addq	%rdx, %rax
> 	retq
> 3:
> 	movq	%rsi, %rax
> 	retq

Nice. But I don't think find_last_bit is important enough to warrant
arch-specific versions.
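
For comparison, here is a rough C sketch of the generic loop that the
quoted assembly corresponds to. It is not the code from the patch: the
name find_last_bit_sketch() is invented for this example, and it assumes
GCC or Clang for __builtin_clzl(), which stands in for bsrq.

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the highest set bit below `size`, or `size` if
 * no bit is set. Illustrative name; relies on a GCC/Clang builtin.
 */
unsigned long find_last_bit_sketch(const unsigned long *addr,
				   unsigned long size)
{
	/* Number of words covering `size` bits (rounded up). */
	unsigned long idx = (size + BITS_PER_LONG - 1) / BITS_PER_LONG;
	/*
	 * Mask for the last, possibly partial, word. The shift count is 0
	 * when size is a multiple of the word size, leaving all bits set.
	 */
	unsigned long mask = ~0UL >> (-size & (BITS_PER_LONG - 1));

	while (idx--) {
		unsigned long val = addr[idx] & mask;

		if (val)
			return idx * BITS_PER_LONG +
			       (BITS_PER_LONG - 1 - __builtin_clzl(val));
		mask = ~0UL;	/* every earlier word is a full word */
	}
	return size;
}
```

The structure matches the listing: the word count and the mask are
computed once up front, the loop walks down from the last word, and the
mask is reset to all ones after the first iteration.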

So, where are we with this? Have we reached some kind of consensus?

Rasmus