Message-ID: <20210912045608.GB16216@1wt.eu>
Date:   Sun, 12 Sep 2021 06:56:09 +0200
From:   Willy Tarreau <w@....eu>
To:     Douglas Gilbert <dgilbert@...erlog.com>
Cc:     LKML <linux-kernel@...r.kernel.org>
Subject: Re: how many memset(,0,) calls in kernel ?

On Sat, Sep 11, 2021 at 11:36:07PM -0400, Douglas Gilbert wrote:
> Here is a pretty rough estimate:
> $ find . -name '*.c' -exec fgrep "memset(" {} \; > memset_in_kern.txt
> 
> $ cat memset_in_kern.txt | wc -l
>     20159
> 
> Some of those are in comments, EXPORTs, etc, but the vast majority are
> in code. Plus there will be memset()s in header files not counted by
> that find. Checking in that output file I see:
> 
> $ grep ", 0," memset_in_kern.txt | wc -l
>     18107
> $ grep ", 0" memset_in_kern.txt | wc -l
>     19349
> $ grep ", 0x" memset_in_kern.txt | wc -l
>     1210
> $ grep ", 0x01" memset_in_kern.txt | wc -l
>     3
> $ grep ", 0x0," memset_in_kern.txt | wc -l
>     199
> $ grep ",0," memset_in_kern.txt | wc -l
>     72

Note that in order to get something faster and slightly more accurate,
you can use 'git grep':

   $ git grep 'memset([^,]*,\s*0\(\|x0*\),' |wc -l
   18822
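
To make explicit what that pattern accepts, here is a minimal sketch in C using POSIX <regex.h>. The pattern is rewritten as an extended regular expression, which is assumed to be equivalent to the basic one given to git grep above; the names ZERO_MEMSET_ERE and is_zeroing_memset() are hypothetical and exist only for this illustration.

```c
#include <regex.h>
#include <stddef.h>

/*
 * ERE rendering of the git grep pattern (an assumed equivalent, not the
 * original BRE): "memset(", a first argument without commas, a comma,
 * optional whitespace, then 0 or 0x0... followed directly by a comma.
 */
static const char ZERO_MEMSET_ERE[] = "memset\\([^,]*,[[:space:]]*0(x0*)?,";

/*
 * Returns 1 if the line contains a memset() whose second argument is a
 * literal zero, 0 if it does not, and -1 if the pattern fails to compile.
 */
int is_zeroing_memset(const char *line)
{
        regex_t re;
        int rc;

        if (regcomp(&re, ZERO_MEMSET_ERE, REG_EXTENDED | REG_NOSUB) != 0)
                return -1;
        rc = regexec(&re, line, 0, NULL, 0);
        regfree(&re);
        return rc == 0;
}
```

With this sketch, "memset(&map, 0, sizeof(map));" and "memset(p,0x00,n);" are counted, while "memset(buf, 0xff, len);" is not. A zero followed by a space before the comma is also missed, so the count remains an approximation.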

> If the BSD flavours of Unix had not given us:
>    void bzero(void *s, size_t n);
> would the Linux kernel have something similar in common usage (e.g.
> memzero() or mem0() ), that was less wasteful than the standard:
>    void *memset(void *s, int c, size_t n);
> in the extremely common case where c=0 and the return value is
> not used?

What do you mean by "wasteful" here? What are you trying to save,
characters in the source code maybe? Because the generated code is
already adapted to the context thanks to memset() being a compiler
builtin. Let's take one of the first instances I found that's easy to
match against asm code:

net/core/dev.c:

  int __init netdev_boot_setup(char *str)
  {
        int ints[5];
        struct ifmap map;

        str = get_options(str, ARRAY_SIZE(ints), ints);
        if (!str || !*str)
                return 0;

        /* Save settings */
        memset(&map, 0, sizeof(map));
        ...
  }

It gives this:

  16:   e8 00 00 00 00          callq  1b <netdev_boot_setup+0x1b>
                        17: R_X86_64_PC32       get_options-0x4
  1b:   48 89 c6                mov    %rax,%rsi

Note that we're zeroing %eax below in preparation for the "return 0"
statement:

  1e:   31 c0                   xor    %eax,%eax

This is the "if (!str || !*str)" :

  20:   48 85 f6                test   %rsi,%rsi
  23:   0f 84 98 00 00 00       je     c1 <netdev_boot_setup+0xc1>
  29:   80 3e 00                cmpb   $0x0,(%rsi)
  2c:   0f 84 8f 00 00 00       je     c1 <netdev_boot_setup+0xc1>

%r12 is set to &map:

  32:   4c 8d 65 d0             lea    -0x30(%rbp),%r12

And this is the memset "call" itself, which reuses the zero from
the %eax register:

  36:   b9 06 00 00 00          mov    $0x6,%ecx
  3b:   4c 89 e7                mov    %r12,%rdi
  3e:   f3 ab                   rep stos %eax,%es:(%rdi)

The last line does the equivalent of "memset(%rdi, 0, 4 * %ecx)": it
stores %ecx (here 6) 32-bit copies of %eax, i.e. 24 zero bytes. That's
just two bytes of code, for an instruction that modern processors are
even able to optimize.
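
For readers less used to string instructions, the sketch below models that sequence in portable C. It assumes the direction flag is clear and that the operand size is 32 bits, as the %eax operand indicates, so a count of 6 covers 24 bytes. The helpers rep_stos32() and stos_matches_memset() are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical C model of "rep stos %eax,%es:(%rdi)" with the direction
 * flag clear: store `count` consecutive copies of a 32-bit value.
 */
static void rep_stos32(void *dst, uint32_t value, size_t count)
{
        unsigned char *p = dst;

        while (count--) {
                memcpy(p, &value, sizeof(value));
                p += sizeof(value);
        }
}

/* Returns 1 if six 32-bit zero stores leave the same 24 bytes as memset(). */
int stos_matches_memset(void)
{
        unsigned char a[24], b[24];

        memset(a, 0xaa, sizeof(a));     /* poison both buffers first */
        memset(b, 0xaa, sizeof(b));

        rep_stos32(a, 0, 6);            /* mov $0x6,%ecx ; rep stos %eax */
        memset(b, 0, sizeof(b));        /* what the C source asked for */

        return memcmp(a, b, sizeof(a)) == 0;
}
```

Calling stos_matches_memset() returns 1: six 4-byte stores of zero and a 24-byte memset() leave identical buffers.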

As you can see there's not much waste here in the output code, and
in fact using any dedicated function would be larger and likely
slower.
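
To experiment with this outside the kernel, a sketch along the following lines can be compiled with optimization and disassembled. Everything in it is an assumption made for illustration: struct ifmap_like only imitates the layout of struct ifmap, and my_memzero() is a hypothetical bzero()-style wrapper, not an existing kernel interface. With memset() handled as a builtin, an optimizing compiler will typically inline such a wrapper and emit the same stores for both variants, although the exact output depends on the compiler and its flags.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct ifmap; the layout is assumed. */
struct ifmap_like {
        unsigned long mem_start;
        unsigned long mem_end;
        unsigned short base_addr;
        unsigned char irq;
        unsigned char dma;
        unsigned char port;
};

/* Hypothetical bzero()-style wrapper around memset(s, 0, n). */
static inline void my_memzero(void *s, size_t n)
{
        memset(s, 0, n);
}

/* Returns 1 if the first n bytes at p are all zero. */
static int all_zero(const void *p, size_t n)
{
        const unsigned char *b = p;
        size_t i;

        for (i = 0; i < n; i++)
                if (b[i] != 0)
                        return 0;
        return 1;
}

/* Zero one struct with memset() and one with the wrapper, then compare. */
int zeroing_is_equivalent(void)
{
        struct ifmap_like a, b;

        memset(&a, 0xff, sizeof(a));    /* poison both structs first */
        memset(&b, 0xff, sizeof(b));

        memset(&a, 0, sizeof(a));       /* the common kernel idiom */
        my_memzero(&b, sizeof(b));      /* the dedicated-function variant */

        return all_zero(&a, sizeof(a)) && all_zero(&b, sizeof(b));
}
```

The function only checks that both variants produce the same result. Comparing the disassembly of the two calls is what shows whether the wrapper saves anything in the generated code.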

Hoping this helps,
Willy
