Message-ID: <3b17d229-bad4-e6a0-9055-c585dd5a62e4@loongson.cn>
Date: Mon, 6 Feb 2023 21:13:22 +0800
From: Jianmin Lv <lvjianmin@...ngson.cn>
To: Xi Ruoyao <xry111@...111.site>, WANG Xuerui <kernel@...0n.name>,
Huacai Chen <chenhuacai@...ngson.cn>,
Arnd Bergmann <arnd@...db.de>,
Huacai Chen <chenhuacai@...nel.org>
Cc: loongarch@...ts.linux.dev, linux-arch@...r.kernel.org,
Xuefeng Li <lixuefeng@...ngson.cn>,
Guo Ren <guoren@...nel.org>,
Jiaxun Yang <jiaxun.yang@...goat.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] LoongArch: Make -mstrict-align be configurable
On 2023/2/6 7:18 PM, Xi Ruoyao wrote:
> On Mon, 2023-02-06 at 18:24 +0800, Jianmin Lv wrote:
>> Hi, Xuerui
>>
>> I think the kernels produced with and without -mstrict-align mainly
>> differ in the following ways:
>> - Different size. I built two kernels (vmlinux): the kernel built with
>> -mstrict-align is 26533376 bytes and the kernel built without
>> -mstrict-align is 26123280 bytes.
>> - Different performance. For example, in the kernel function jhash(), the
>> assembly code with and without -mstrict-align is as follows:
>
> But there are still questions remaining:
>
> (1) Is the difference contributed by a bad code generation of GCC? If
> true, it's better to improve GCC before someone starts to build a distro
> for LA264 as it would benefit the user space as well.
>
AFAIK, GCC by default (without -mstrict-align) generates binaries that
use unaligned accesses, in order to improve user space performance
(smaller size and faster runtime). That default is also based on the fact
that the vast majority of LoongArch CPUs support unaligned access.
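To illustrate the kind of code where the option matters, here is a
minimal sketch (not taken from the kernel; load_u32 and
check_load_u32 are hypothetical names used only for this example).
Without -mstrict-align I would expect the compiler to be free to turn
the memcpy() into a single word load, while with -mstrict-align it has
to assemble the value from narrower accesses, which costs both size and
cycles. Comparing the output of "gcc -O2 -S" with and without the
option should show the difference, though I have not listed the exact
instructions here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read a 32-bit value from an address that may
 * not be 4-byte aligned. memcpy() keeps the C code well defined; how
 * the copy is lowered (one word load or several narrower loads) is
 * left to the compiler and depends on -mstrict-align.
 */
static uint32_t load_u32(const unsigned char *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/*
 * Self-check that does not depend on endianness: the bytes read from
 * an odd offset must come back unchanged when stored again.
 * Returns 1 on success.
 */
static int check_load_u32(void)
{
	unsigned char buf[8] = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x00, 0x00, 0x00 };
	unsigned char back[4];
	uint32_t v = load_u32(buf + 1);	/* deliberately misaligned source */

	memcpy(back, &v, sizeof(back));
	assert(memcmp(back, buf + 1, sizeof(back)) == 0);
	return 1;
}
```

The result is the same either way; only the generated code differs,
which is why the two vmlinux images differ in size.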
> (2) Is there some "big bad unaligned access loop" on a hot spot in the
> kernel code? If true, it may be better to just refactor the C code
> because doing so will benefit all ports, not only LoongArch. Otherwise,
> it may be unworthy to optimize for some cold paths.
>
Frankly, I'm not sure whether this kind of hot code exists in the kernel;
I have only observed the difference in kernel size and in the generated
assembly. And I'm afraid that, even if such code exists, it may be
difficult to judge whether it is a legitimate hot path or not.