lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 21 Dec 2017 12:22:27 +0000
From:   Ard Biesheuvel <ard.biesheuvel@...aro.org>
To:     Arnd Bergmann <arnd@...db.de>
Cc:     Jakub Jelinek <jakub@...hat.com>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        James Morris <james.l.morris@...cle.com>,
        Richard Biener <rguenther@...e.de>,
        Jakub Jelinek <jakub@....gnu.org>,
        "David S. Miller" <davem@...emloft.net>,
        linux-crypto@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] [RFT] crypto: aes-generic - turn off -ftree-pre and -ftree-sra

On 21 December 2017 at 10:20, Arnd Bergmann <arnd@...db.de> wrote:
> On Wed, Dec 20, 2017 at 10:46 PM, Jakub Jelinek <jakub@...hat.com> wrote:
>> On Wed, Dec 20, 2017 at 09:52:05PM +0100, Arnd Bergmann wrote:
>>> diff --git a/crypto/aes_generic.c b/crypto/aes_generic.c
>>> index ca554d57d01e..35f973ba9878 100644
>>> --- a/crypto/aes_generic.c
>>> +++ b/crypto/aes_generic.c
>>> @@ -1331,6 +1331,20 @@ EXPORT_SYMBOL_GPL(crypto_aes_set_key);
>>>       f_rl(bo, bi, 3, k);     \
>>>  } while (0)
>>>
>>> +#if __GNUC__ >= 7
>>> +/*
>>> + * Newer compilers try to optimize integer arithmetic more aggressively,
>>> + * which generally improves code quality a lot, but in this specific case
>>> + * ends up hurting more than it helps, in some configurations drastically
>>> + * so. This turns off two optimization steps that have been shown to
>>> + * lead to rather badly optimized code with gcc-7.
>>> + *
>>> + * See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
>>> + */
>>> +#pragma GCC optimize("-fno-tree-pre")
>>> +#pragma GCC optimize("-fno-tree-sra")
>>
>> So do it only when UBSAN is enabled?  GCC doesn't have a particular
>> predefined macro for those (only for asan and tsan), but either the kernel
>> does have something already, or could have something added in the
>> corresponding Makefile.
>
> My original interpretation of the resulting object code suggested that disabling
> those two optimizations produced better results for this particular
> file even without
> UBSAN, on both gcc-7 and gcc-8 (but not gcc-6), so my patch might have
> been better, but I did some measurements now as Ard suggested, showing
> cycles/byte for AES256/CBC with 8KB blocks:
>
>
>                default      ubsan         patched        patched+ubsan
> gcc-4.3.6        14.9        ----           14.9         ----
> gcc-4.6.4        15.0        ----           15.8         ----
> gcc-4.9.4        15.5        20.7           15.9         20.9
> gcc-5.5.0        15.6        47.3           86.4         48.8
> gcc-6.3.1        14.6        49.4           94.3         50.9
> gcc-7.1.1        13.5        54.6           15.2         52.0
> gcc-7.2.1        16.8       124.7           92.0         52.2
> gcc-8.0.0        15.0      no boot          15.3        no boot
>
> I checked that there are actually three significant digits on the measurements,
> detailed output is available at https://pastebin.com/eFsWYjQp
>
> It seems that I was wrong about the interpretation that disabling
> the optimization would be a win on gcc-7 and higher, it almost
> always makes things worse even with UBSAN. Making that
> check "#if __GNUC__ == 7 && IS_ENABLED(CONFIG_UBSAN_SANITIZE_ALL)"
> would help here, or we could list the file as an exception for
> UBSAN and never sanitize it.
>
> Looking at the 'default' column, I wonder if anyone would be interested
> in looking at why the throughput regressed with gcc-7.2 and gcc-8.
>

Thanks for the elaborate benchmarks. Looking at the bugzilla entry, it
appears the UBSAN code inserts runtime checks to ensure that certain
u8 variables don't assume values <0 or >255, which seems like a rather
pointless exercise to me. But even if it didn't, I think it is
justified to disable UBSAN on all of the low-level cipher
implementations, given that they are hardly ever modified, and
typically don't suffer from the issues UBSAN tries to identify.

So my vote is to disable UBSAN for all such cipher implementations:
aes_generic, but also aes_ti, which has a similar 256 byte lookup
table [although it does not seem to be affected by the same issue as
aes_generic], and possibly others as well.

Perhaps it makes sense to move core cipher code into a separate
sub-directory, and disable UBSAN at the directory level?

It would involve the following files

crypto/aes_generic.c
crypto/aes_ti.c
crypto/anubis.c
crypto/arc4.c
crypto/blowfish_generic.c
crypto/camellia_generic.c
crypto/cast5_generic.c
crypto/cast6_generic.c
crypto/des_generic.c
crypto/fcrypt.c
crypto/khazad.c
crypto/seed.c
crypto/serpent_generic.c
crypto/tea.c
crypto/twofish_generic.c

Powered by blists - more mailing lists