Message-ID: <adbdea88-2da6-e3c7-cb86-e75995e550f3@huaweicloud.com>
Date: Sun, 25 Jun 2023 14:44:58 +0800
From: Li Nan <linan666@...weicloud.com>
To: Paul Menzel <pmenzel@...gen.mpg.de>
Cc: song@...nel.org, linux-raid@...r.kernel.org,
linux-kernel@...r.kernel.org, linan122@...wei.com,
yukuai3@...wei.com, yi.zhang@...wei.com, houtao1@...wei.com,
yangerkun@...wei.com
Subject: Re: [PATCH 1/3] md/raid10: optimize fix_read_error
On 2023/6/23 18:03, Paul Menzel wrote:
> Dear Li,
>
>
> Thank you for your patch.
>
> On 23.06.23 at 19:32, linan666@...weicloud.com wrote:
>> From: Li Nan <linan122@...wei.com>
>>
>> We dereference r10_bio->read_slot too many times in fix_read_error().
>> Optimize it by using a variable to store read_slot.
>
> I am always cautious reading about optimizations without any benchmarks
> or object code analysis. Although your explanation makes sense, did you
> check, that performance didn’t decrease in some way? (Maybe the compiler
> even generates the same code.)
>
>
> Kind regards,
>
> Paul
>
>
I compared the assembly code before and after the change:
- With gcc 8.3.0, both versions compile to the same code.
- With gcc 12.2.1, 'r10_bio->read_slot' is mostly read directly from
  memory through the r10_bio pointer:

    2853 while (sl != r10_bio->read_slot) {
    0xffffffff8213f2a0 <+1328>: cmp %r14d,0x38(%rbp)

  whereas 'slot' is mostly kept in a register of its own:

    2819 while (sl != slot) {
    0xffffffff8213f08a <+794>: cmp %r14d,%ebx
I have not measured the performance, and I do not expect a significant
improvement from this change, as the assembly comparison above also
suggests. In fact, I just want to make the code more readable.
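
For illustration only, here is a minimal user-space sketch of the pattern
being discussed. It is not the fix_read_error() code itself: the struct
demo_r10bio, the NR_COPIES constant and the function names are all
hypothetical, and the loop only counts the steps taken while walking 'sl'
back to the read slot. The first function re-reads the field through the
pointer on every iteration; the second copies it once into a local
variable, which a compiler may keep in a register.

```c
#include <assert.h>

#define NR_COPIES 4	/* hypothetical number of copies, for this demo only */

/* Hypothetical stand-in for struct r10bio; only the field of interest. */
struct demo_r10bio {
	int read_slot;
};

/*
 * "Before": compare against r10_bio->read_slot on every iteration.
 * Both sl and read_slot must lie in [0, NR_COPIES), otherwise the
 * loop never terminates.
 */
static int steps_back_deref(const struct demo_r10bio *r10_bio, int sl)
{
	int steps = 0;

	while (sl != r10_bio->read_slot) {
		if (sl == 0)
			sl = NR_COPIES;
		sl--;
		steps++;
	}
	return steps;
}

/* "After": read the field once and compare against the local copy. */
static int steps_back_local(const struct demo_r10bio *r10_bio, int sl)
{
	int slot = r10_bio->read_slot;
	int steps = 0;

	while (sl != slot) {
		if (sl == 0)
			sl = NR_COPIES;
		sl--;
		steps++;
	}
	return steps;
}

/* Check that both variants agree for every valid starting slot. */
static void check_equivalent(const struct demo_r10bio *r10_bio)
{
	for (int sl = 0; sl < NR_COPIES; sl++)
		assert(steps_back_deref(r10_bio, sl) ==
		       steps_back_local(r10_bio, sl));
}
```

Both variants return the same result; the only difference is whether the
comparison operand comes from memory or from a local variable, which is
the difference visible in the gcc 12.2.1 listing above.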
--
Thanks,
Nan