lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ec88d9a2-272e-973e-55ce-f3d411aafcd1@rasmusvillemoes.dk>
Date:   Mon, 1 Apr 2019 22:43:13 +0200
From:   Rasmus Villemoes <linux@...musvillemoes.dk>
To:     Sultan Alsawaf <sultan@...neltoast.com>
Cc:     akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
        Nathan Chancellor <natechancellor@...il.com>
Subject: Re: [RFCv2] string: Use faster alternatives when constant arguments
 are used

On 30/03/2019 23.59, Sultan Alsawaf wrote:
> How can the memcmps cross a page boundary when memcmp itself will
> only read in large buffers of data at word boundaries?

Consider your patch replacing !strcmp(buf, "123") by !memcmp(buf, "123",
4). buf is known to point to a nul-terminated string. But it may point
at, say, the second-last byte in a page, with the last byte in that page
being a nul byte, and the following page being MMIO or unmapped or all
kinds of bad things. On e.g. x86 where unaligned accesses are cheap, and
seeing that you're only comparing for equality, gcc is likely to compile
the memcmp version into

  *(u32*)buf == 0x00333231

because you've told the compiler that there's no problem accessing four
bytes starting at buf. Boom. Even without unaligned access being cheap
this can happen; suppose the length is 8 instead, and gcc somehow knows
that buf is four-byte aligned (and in this case it happens to point four
bytes before a page boundary), so it could compile the memcmp(,,8) into

  *(u32*)(buf+4) == secondword && *(u32*)buf == firstword

(or do the comparisons in the "natural" order, but it might still do
both loads first).

> And if there are concerns for some arches but not others, then couldn't this be
> a feasible optimization for those which would work well with it?

No. First, these are concerns for all arches. Second, if you can find
some particular place where string parsing/matching is in any way
performance relevant and not just done once during driver init or
whatnot, maybe the maintainers of that file would take a patch
hand-optimizing some strcmps to memcmps, or, depending on what the code
does, perhaps replacing the whole *cmp logic with a custom hash table.

But a patch implicitly and silently touching thousands of lines of code,
without an analysis of why none of the above is a problem for any of
those lines, for any .config, arch, compiler version? No.

Rasmus


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ