Date:   Mon, 10 Oct 2022 10:03:53 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Willy Tarreau' <w@....eu>, Alexey Dobriyan <adobriyan@...il.com>
CC:     "lkp@...el.com" <lkp@...el.com>,
        "linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Paul E. McKenney" <paulmck@...nel.org>
Subject: RE: tools/nolibc: fix missing strlen() definition and infinite loop
 with gcc-12

From: Willy Tarreau <w@....eu>
> Sent: 09 October 2022 19:36
...
> By the way, just for the sake of completeness, the one that consistently
> gives me a better output is this one:
> 
>   size_t strlen(const char *str)
>   {
>           const char *s0 = str--;
> 
>           while (*++str)
>                   ;
>           return str - s0;
>   }
> 
> Which gives me this:
> 
> 
>   0000000000000000 <strlen>:
>      0:   48 8d 47 ff             lea    -0x1(%rdi),%rax
>      4:   48 ff c0                inc    %rax
>      7:   80 38 00                cmpb   $0x0,(%rax)
>      a:   75 f8                   jne    4 <len+0x4>
>      c:   48 29 f8                sub    %rdi,%rax
>      f:   c3                      ret
> 
> But this is totally ruined by the addition of asm() in the loop. However
> I suspect that the construct is difficult to match against a real strlen()
> since it starts on an extra character, thus placing the asm() statement
> before the loop could durably preserve it. It does work here (the code
> remains the exact same one), but for how long, that's the question. Maybe
> we can revisit the various loop-based functions in the future with this in
> mind.

clang wilfully and persistently generates:

strlen:                                 # @strlen
        movq    $-1, %rax
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        cmpb    $0, 1(%rdi,%rax)
        leaq    1(%rax), %rax
        jne     .LBB0_1
        retq

But feed the C for that loop into gcc and it generates a 'jmp strlen'
at every optimisation level above -O1.
I suspect the clang loop might run in fewer clocks/byte than the code above.
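
For reference, a minimal sketch of C that corresponds line by line to the
clang output above. The name my_strlen is hypothetical, chosen only so the
sketch does not clash with the libc strlen(); the exact source fed to the
compilers is not shown in this thread, so treat this as an assumption.

```c
#include <stddef.h>

/* Hypothetical name; avoids clashing with the libc strlen(). */
size_t my_strlen(const char *str)
{
        size_t len = (size_t)-1;        /* movq $-1, %rax */
        char c;

        do {
                c = str[len + 1];       /* cmpb $0, 1(%rdi,%rax) */
                len++;                  /* leaq 1(%rax), %rax */
        } while (c);                    /* jne .LBB0_1 */

        return len;
}
```

The unsigned wrap of len + 1 on the first pass is well defined, so the
first byte read is str[0]. An optimising compiler may still recognise the
loop and emit a call to strlen() instead.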

Sometimes I hate some compiler pessimisations.
Substituting a call to strlen() for a coded loop is typical.
strlen() is almost certainly optimised for long strings.
If the string is short the coded loop will be faster.
The same is true (and probably more so) for memcpy.
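
The asm() approach mentioned in the quoted text can be sketched roughly as
below, assuming gcc or clang inline asm syntax. The name barrier_strlen is
hypothetical, and whether the empty asm() keeps the compiler from
substituting a strlen() call depends on the compiler version, so this is
an illustration rather than a guaranteed fix.

```c
#include <stddef.h>

/* Hypothetical name; avoids clashing with the libc strlen(). */
size_t barrier_strlen(const char *str)
{
        size_t len;

        for (len = 0; str[len]; len++)
                __asm__ volatile("");   /* empty asm in the loop body, emits no instructions */

        return len;
}
```

The empty asm statement adds no code of its own; it is there so the loop
body no longer looks like a plain byte scan to the optimiser.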

	David

