lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5ee86b3b-81d2-790c-f67b-e250f60272fd@loongson.cn>
Date:   Wed, 17 Mar 2021 08:26:23 +0800
From:   Tiezhu Yang <yangtiezhu@...ngson.cn>
To:     David Laight <David.Laight@...LAB.COM>,
        "Maciej W. Rozycki" <macro@...am.me.uk>
Cc:     Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
        "linux-mips@...r.kernel.org" <linux-mips@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Xuefeng Li <lixuefeng@...ngson.cn>
Subject: Re: [PATCH v2] MIPS: Check __clang__ to avoid performance influence
 with GCC in csum_tcpudp_nofold()

On 03/15/2021 08:42 PM, David Laight wrote:
> From: Tiezhu Yang <yangtiezhu@...ngson.cn>
>> Sent: 15 March 2021 12:26
>> On 03/15/2021 04:49 AM, Maciej W. Rozycki wrote:
>>> On Tue, 9 Mar 2021, Tiezhu Yang wrote:
>>>
>>>> diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
>>>> index 1e6c135..80eddd4 100644
>>>> --- a/arch/mips/include/asm/checksum.h
>>>> +++ b/arch/mips/include/asm/checksum.h
>>>> @@ -128,9 +128,13 @@ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
>>>>
>>>>    static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
>>>>    					__u32 len, __u8 proto,
>>>> -					__wsum sum)
>>>> +					__wsum sum_in)
>>>>    {
>>>> -	unsigned long tmp = (__force unsigned long)sum;
>>>> +#ifdef __clang__
>>>> +	unsigned long sum = (__force unsigned long)sum_in;
>>>> +#else
>>>> +	__wsum sum = sum_in;
>>>> +#endif
>>>    This looks much better to me, but I'd keep the variable names unchanged
>>> as `sum_in' isn't used beyond the initial assignment anyway (you'll have
>>> to update the references with asm operands accordingly of course).
>>>
>>>    Have you verified that code produced by GCC remains the same with your
>>> change in place as it used to be up to commit 198688edbf77?  I can see no
>>> such information in the commit description whether here or in the said
>>> commit.
>>>
>>>     Maciej
>> Hi Maciej,
>>
>> Thanks for your reply.
>>
>> gcc --version
>> gcc (Debian 10.2.1-6) 10.2.1 20210110
>>
>> net/ipv4/tcp_ipv4.c
>> tcp_v4_send_reset()
>>     csum_tcpudp_nofold()
>>
>> objdump -d vmlinux > vmlinux.dump
>>
>> (1) Before commit 198688edbf77
>> ("MIPS: Fix inline asm input/output type mismatch in checksum.h used
>> with Clang"):
>>
>> ffffffff80aa835c:       00004025        move    a4,zero
>> ffffffff80aa8360:       92020012        lbu     v0,18(s0)
>> ffffffff80aa8364:       de140030        ld      s4,48(s0)
>> ffffffff80aa8368:       0064182d        daddu   v1,v1,a0
>> ffffffff80aa836c:       304200ff        andi    v0,v0,0xff
>> ffffffff80aa8370:       9c64000c        lwu     a0,12(v1)
>> ffffffff80aa8374:       9c660010        lwu     a2,16(v1)
>> ffffffff80aa8378:       afa70038        sw      a3,56(sp)
>> ffffffff80aa837c:       24071a00        li      a3,6656
>> ffffffff80aa8380:       0086202d        daddu   a0,a0,a2
>> ffffffff80aa8384:       0087202d        daddu   a0,a0,a3
>> ffffffff80aa8388:       0088202d        daddu   a0,a0,a4
>> ffffffff80aa838c:       0004083c        dsll32  at,a0,0x0
>> ffffffff80aa8390:       0081202d        daddu   a0,a0,at
>> ffffffff80aa8394:       0081082b        sltu    at,a0,at
>> ffffffff80aa8398:       0004203f        dsra32  a0,a0,0x0
>> ffffffff80aa839c:       00812021        addu    a0,a0,at
>>
>> (2) After commit 198688edbf77
>> ("MIPS: Fix inline asm input/output type mismatch in checksum.h used
>> with Clang"):
>>
>> ffffffff80aa836c:       00004025        move    a4,zero
>> ffffffff80aa8370:       92040012        lbu     a0,18(s0)
>> ffffffff80aa8374:       de140030        ld      s4,48(s0)
>> ffffffff80aa8378:       0062182d        daddu   v1,v1,v0
>> ffffffff80aa837c:       308400ff        andi    a0,a0,0xff
>> ffffffff80aa8380:       9c62000c        lwu     v0,12(v1)
>> ffffffff80aa8384:       9c660010        lwu     a2,16(v1)
>> ffffffff80aa8388:       afa70038        sw      a3,56(sp)
>> ffffffff80aa838c:       24071a00        li      a3,6656
>> ffffffff80aa8390:       0046102d        daddu   v0,v0,a2
>> ffffffff80aa8394:       0047102d        daddu   v0,v0,a3
>> ffffffff80aa8398:       0048102d        daddu   v0,v0,a4
>> ffffffff80aa839c:       0002083c        dsll32  at,v0,0x0
>> ffffffff80aa83a0:       0041102d        daddu   v0,v0,at
>> ffffffff80aa83a4:       0041082b        sltu    at,v0,at
>> ffffffff80aa83a8:       0002103f        dsra32  v0,v0,0x0
>> ffffffff80aa83ac:       00411021        addu    v0,v0,at
>>
>> (3) With this patch:
>>
>> ffffffff80aa835c:       00004025        move    a4,zero
>> ffffffff80aa8360:       92020012        lbu     v0,18(s0)
>> ffffffff80aa8364:       de140030        ld      s4,48(s0)
>> ffffffff80aa8368:       0064182d        daddu   v1,v1,a0
>> ffffffff80aa836c:       304200ff        andi    v0,v0,0xff
>> ffffffff80aa8370:       9c64000c        lwu     a0,12(v1)
>> ffffffff80aa8374:       9c660010        lwu     a2,16(v1)
>> ffffffff80aa8378:       afa70038        sw      a3,56(sp)
>> ffffffff80aa837c:       24071a00        li      a3,6656
>> ffffffff80aa8380:       0086202d        daddu   a0,a0,a2
>> ffffffff80aa8384:       0087202d        daddu   a0,a0,a3
>> ffffffff80aa8388:       0088202d        daddu   a0,a0,a4
>> ffffffff80aa838c:       0004083c        dsll32  at,a0,0x0
>> ffffffff80aa8390:       0081202d        daddu   a0,a0,at
>> ffffffff80aa8394:       0081082b        sltu    at,a0,at
>> ffffffff80aa8398:       0004203f        dsra32  a0,a0,0x0
>> ffffffff80aa839c:       00812021        addu    a0,a0,at
>>
>> (4) With the following changes based on commit 198688edbf77
>> ("MIPS: Fix inline asm input/output type mismatch in checksum.h used
>> with Clang"):
>>
>> diff --git a/arch/mips/include/asm/checksum.h
>> b/arch/mips/include/asm/checksum.h
>> index 1e6c135..e1f80407 100644
>> --- a/arch/mips/include/asm/checksum.h
>> +++ b/arch/mips/include/asm/checksum.h
>> @@ -130,7 +130,11 @@ static inline __wsum csum_tcpudp_nofold(__be32
>> saddr, __be32 daddr,
>>                        __u32 len, __u8 proto,
>>                        __wsum sum)
>>    {
>> +#ifdef __clang__
>>        unsigned long tmp = (__force unsigned long)sum;
>> +#else
>> +    __wsum tmp = sum;
>> +#endif
>>
>>        __asm__(
>>        "    .set    push        # csum_tcpudp_nofold\n"
>>
>> ffffffff80aa835c:       00004025        move    a4,zero
>> ffffffff80aa8360:       92020012        lbu     v0,18(s0)
>> ffffffff80aa8364:       de140030        ld      s4,48(s0)
>> ffffffff80aa8368:       0064182d        daddu   v1,v1,a0
>> ffffffff80aa836c:       304200ff        andi    v0,v0,0xff
>> ffffffff80aa8370:       9c64000c        lwu     a0,12(v1)
>> ffffffff80aa8374:       9c660010        lwu     a2,16(v1)
>> ffffffff80aa8378:       afa70038        sw      a3,56(sp)
>> ffffffff80aa837c:       24071a00        li      a3,6656
>> ffffffff80aa8380:       0086202d        daddu   a0,a0,a2
>> ffffffff80aa8384:       0087202d        daddu   a0,a0,a3
>> ffffffff80aa8388:       0088202d        daddu   a0,a0,a4
>> ffffffff80aa838c:       0004083c        dsll32  at,a0,0x0
>> ffffffff80aa8390:       0081202d        daddu   a0,a0,at
>> ffffffff80aa8394:       0081082b        sltu    at,a0,at
>> ffffffff80aa8398:       0004203f        dsra32  a0,a0,0x0
>> ffffffff80aa839c:       00812021        addu    a0,a0,at
>>
>> The code produced by GCC remains the same between (1), (3) and (4),
>> the last changes looks like better (with less changes based on commit
>> 198688edbf77), so I will send v3 later.
> Aren't those all the same - apart from register selection.
> Not that I grok the mips opcodes.
> But that code has horridness on its side.
>
> The only obvious difference is that something else changes the
> code offset from xxxx5c to xxxx6c.
>
> 	David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)

Hi David,

Yes, it seems no much obvious differences.
Let me wait for other feedback.

Hi Thomas and Maciej,

Is this patch necessary? If no, we can ignore it.
If yes, I will send v3 with the above (4) changes.

Thanks,
Tiezhu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ