linux-kernel - Re: [PATCH] LoongArch: Make -mstrict-align be configurable


Message-ID: <85c36350-34d2-d333-8e47-255914d3fdaa@loongson.cn>
Date:   Tue, 7 Feb 2023 09:13:38 +0800
From:   Jianmin Lv <lvjianmin@...ngson.cn>
To:     Arnd Bergmann <arnd@...db.de>, Xi Ruoyao <xry111@...111.site>,
        WANG Xuerui <kernel@...0n.name>,
        Huacai Chen <chenhuacai@...ngson.cn>,
        Huacai Chen <chenhuacai@...nel.org>
Cc:     loongarch@...ts.linux.dev, Linux-Arch <linux-arch@...r.kernel.org>,
        Xuefeng Li <lixuefeng@...ngson.cn>, guoren <guoren@...nel.org>,
        Jiaxun Yang <jiaxun.yang@...goat.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] LoongArch: Make -mstrict-align be configurable



On 2023/2/6 9:22 PM, Arnd Bergmann wrote:
> On Mon, Feb 6, 2023, at 14:13, Jianmin Lv wrote:
>> On 2023/2/6 7:18 PM, Xi Ruoyao wrote:
>>> On Mon, 2023-02-06 at 18:24 +0800, Jianmin Lv wrote:
>>>> Hi, Xuerui
>>>>
>>>> I think the kernels produced with and without -mstrict-align differ
>>>> mainly in the following ways:
>>>> - Different size. I built two kernels (vmlinux): the kernel with
>>>> -mstrict-align is 26533376 bytes and the kernel without
>>>> -mstrict-align is 26123280 bytes.
>>>> - Different performance. For example, in the kernel function jhash(), the
>>>> assembly code snippets with and without -mstrict-align are as follows:
>>>
>>> But there are still questions remaining:
>>>
>>> (1) Is the difference caused by bad code generation in GCC?  If
>>> true, it's better to improve GCC before someone starts to build a distro
>>> for LA264 as it would benefit the user space as well.
>>>
>> AFAIK, GCC produces binaries that use unaligned accesses by default
>> (without -mstrict-align) to improve user space performance (smaller
>> size and better runtime performance), which is also based on the fact
>> that the vast majority of LoongArch CPUs support unaligned access.
>>
>>> (2) Is there some "big bad unaligned access loop" on a hot spot in the
>>> kernel code?  If true, it may be better to just refactor the C code
>>> because doing so will benefit all ports, not only LoongArch.  Otherwise,
>>> it may not be worth optimizing for some cold paths.
>>>
>> Frankly, I'm not sure whether there is this kind of hot code in the
>> kernel; I only see the difference in kernel size and in the generated
>> assembly. And I'm afraid that, if such code exists, it may be difficult
>> to judge whether it is really hot or not.
> 
> Just look for CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, this will
> show you code locations that use different implementations based on
> whether the kernel should run on CPUs without unaligned access or
> not.
> 
>        Arnd
> 

Got it, thank you very much. I grepped for
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS and found many matches in
drivers, lib, net and so on. It seems reasonable to use the
high-performance path on CPUs with HAVE_EFFICIENT_UNALIGNED_ACCESS
configured.
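
For reference, the pattern those matches follow can be sketched roughly
as below. This is only a minimal user-space illustration, not code from
the kernel tree: load_u32(), count_equal_prefix() and the plain
HAVE_EFFICIENT_UNALIGNED_ACCESS macro (standing in for the Kconfig
symbol, e.g. passed with -DHAVE_EFFICIENT_UNALIGNED_ACCESS) are
hypothetical names chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Load a 32-bit value from a possibly unaligned address.
 * memcpy() keeps this well-defined C; the compiler is free to lower it
 * to a single word load when the target allows unaligned access, or to
 * byte loads when built with -mstrict-align.
 */
static inline uint32_t load_u32(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Return the number of leading bytes that are equal in a and b. */
size_t count_equal_prefix(const unsigned char *a, const unsigned char *b,
			  size_t len)
{
	size_t i = 0;

#ifdef HAVE_EFFICIENT_UNALIGNED_ACCESS
	/* Fast path (assumed macro): compare four bytes per iteration. */
	while (len - i >= sizeof(uint32_t) &&
	       load_u32(a + i) == load_u32(b + i))
		i += sizeof(uint32_t);
#endif
	/* Portable path, also used for the tail: one byte at a time. */
	while (i < len && a[i] == b[i])
		i++;

	return i;
}
```

Both builds return the same result; only the number of loads per
iteration differs, which is where a size and speed difference like the
one seen with and without -mstrict-align could come from.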