linux-kernel - Re: [PATCH v2 2/3] microblaze: Do loop unrolling for optimized memset implementation


Message-ID: <9e6f88c2-a868-71fe-227f-054bab0429d8@xilinx.com>
Date:   Mon, 28 Feb 2022 07:38:15 +0100
From:   Michal Simek <michal.simek@...inx.com>
To:     David Laight <David.Laight@...LAB.COM>,
        'Michal Simek' <michal.simek@...inx.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "monstr@...str.eu" <monstr@...str.eu>,
        "git@...inx.com" <git@...inx.com>
CC:     Mahesh Bodapati <mbodapat@...inx.com>,
        Randy Dunlap <rdunlap@...radead.org>
Subject: Re: [PATCH v2 2/3] microblaze: Do loop unrolling for optimized memset
 implementation



On 2/25/22 22:50, David Laight wrote:
> From: Michal Simek
>> Sent: 25 February 2022 13:56
>>
>> Align the implementation with memcpy and memmove, where the remaining bytes
>> are also copied via a final switch case instead of a simple loop. There is
>> a much stronger reason for this change, though, and aligning the
>> implementations is not the key point here. It is just good to keep in mind
>> that the same technique is already used there.
>>
>> Since GCC 10, the -ftree-loop-distribute-patterns optimization is enabled
>> at -O2. This optimization causes GCC to convert the while loop in memset.c
>> into a call to memset.
> 
> Gah...
> That is nearly as brain dead as another compiler that would convert
> any byte copy loop (on x86) into 'rep movsb'.
> 
> If I want to call memcpy() I'll call memcpy.
> If I'm copying a few bytes I might write the loop to avoid
> the cost of the call and all the conditional tests for
> buffer length and alignment.
> 
> Don't the compiler writers have better things to do?

Not sure what you want me to say about it. It is the current GCC behavior and I 
can't see a way back. I don't think loop unrolling here is a big deal, because 
the same technique has been used for years in memcpy and memmove.
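
For readers following the thread, here is a minimal sketch of the two shapes
being discussed. It is not the code from the patch: the function names
sketch_memset_loop() and sketch_set_tail() are hypothetical, and the tail is
assumed to be at most 3 bytes. The first function is the kind of simple byte
loop that GCC 10 and later may recognise at -O2 and replace with a call to
memset, which presumably is a problem when the loop is itself the body of
memset. The second stores the remaining bytes through a fall-through switch,
so there is no loop left to recognise.

```c
#include <stddef.h>

/*
 * Simple byte loop. With -ftree-loop-distribute-patterns (enabled at -O2
 * since GCC 10) the compiler may turn this into a call to memset().
 */
void *sketch_memset_loop(void *dst, int c, size_t n)
{
	char *p = dst;

	while (n--)
		*p++ = (char)c;

	return dst;
}

/*
 * Unrolled tail: stores the last n bytes via a fall-through switch.
 * Assumption for this sketch: n is 0..3; larger values store nothing.
 */
void sketch_set_tail(char *dst, char c, size_t n)
{
	switch (n) {
	case 3:
		*dst++ = c;
		/* fall through */
	case 2:
		*dst++ = c;
		/* fall through */
	case 1:
		*dst++ = c;
		break;
	default:
		break;
	}
}
```

Both variants write the same bytes for a short tail; only the second avoids
the loop pattern that the optimization looks for.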

Thanks,
Michal