lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7682977b-5929-890a-3a18-662fbfcede5c@gmail.com>
Date:   Tue, 24 May 2022 09:38:27 -0700
From:   Florian Fainelli <f.fainelli@...il.com>
To:     "Maciej W. Rozycki" <macro@...am.me.uk>,
        Thomas Bogendoerfer <tsbogend@...ha.franken.de>
Cc:     Paul Cercueil <paul@...pouillou.net>,
        Nathan Chancellor <nathan@...nel.org>,
        Tiezhu Yang <yangtiezhu@...ngson.cn>,
        linux-mips@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] MIPS: Rewrite `csum_tcpudp_nofold' in plain C

On 5/22/22 13:48, Maciej W. Rozycki wrote:
> Recent commit 198688edbf77 ("MIPS: Fix inline asm input/output type
> mismatch in checksum.h used with Clang") introduced a code size and
> performance regression with 64-bit code emitted for `csum_tcpudp_nofold'
> by GCC, caused by a redundant truncation operation produced due to a
> data type change made to the variable associated with the inline
> assembly's output operand.
> 
> The intent previously expressed here with operands and constraints for
> optimal code was to have the output operand share a register with one
> inputs, both of a different integer type each.  This is perfectly valid
> with the MIPS psABI where a register can hold integer data of different
> types and the assembly code used here makes data stored in the output
> register match the data type used with the output operand, however it
> has turned out impossible to express this arrangement in source code
> such as to satisfy LLVM, apparently due to the compiler's internal
> limitations.
> 
> There is nothing peculiar about the inline assembly `csum_tcpudp_nofold'
> includes however, though it does choose assembly instructions carefully.
> 
> Rewrite this piece of assembly in plain C then, using corresponding C
> language operations, making GCC produce the same assembly instructions,
> possibly shuffled, in the general case and sometimes actually fewer of
> them where an input is constant, because the compiler does not have to
> reload it to a register (operand constraints could be adjusted for that,
> but the plain C approach is cleaner anyway).
> 
> Example code size changes are as follows, for a 32-bit configuration:
> 
>        text       data        bss      total filename
>     5920480    1347236     126592    7394308 vmlinux-old
>     5920480    1347236     126592    7394308 vmlinux-now
>     5919728    1347236     126592    7393556 vmlinux-c
> 
> and for a 64-bit configuration:
> 
>        text       data        bss      total filename
>     6024112    1790828     225728    8040668 vmlinux-old
>     6024128    1790828     225728    8040684 vmlinux-now
>     6023760    1790828     225728    8040316 vmlinux-c
> 
> respectively, where "old" is with the commit referred reverted, "now" is
> with no change, and "c" is with this change applied.
> 
> Signed-off-by: Maciej W. Rozycki <macro@...am.me.uk>
> ---
> Hi,
> 
>   I have visually inspected code produced and verified this change to boot
> with TCP networking performing just fine, both with a 32-bit and a 64-bit
> configuration.  Sadly with the little endianness only, because in the
> course of this verification I have discovered the core card of my Malta
> board bit the dust a few days ago, apparently in a permanent manner, and I
> have no other big-endian MIPS system available here to try.

How about QEMU is not that a viable option for testing big/little endian 
configurations?
-- 
Florian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ