Message-ID: <adbdea88-2da6-e3c7-cb86-e75995e550f3@huaweicloud.com>
Date:   Sun, 25 Jun 2023 14:44:58 +0800
From:   Li Nan <linan666@...weicloud.com>
To:     Paul Menzel <pmenzel@...gen.mpg.de>
Cc:     song@...nel.org, linux-raid@...r.kernel.org,
        linux-kernel@...r.kernel.org, linan122@...wei.com,
        yukuai3@...wei.com, yi.zhang@...wei.com, houtao1@...wei.com,
        yangerkun@...wei.com
Subject: Re: [PATCH 1/3] md/raid10: optimize fix_read_error



On 2023/6/23 18:03, Paul Menzel wrote:
> Dear Li,
> 
> 
> Thank you for your patch.
> 
> On 23.06.23 at 19:32, linan666@...weicloud.com wrote:
>> From: Li Nan <linan122@...wei.com>
>>
>> We dereference r10_bio->read_slot too many times in fix_read_error().
>> Optimize it by using a variable to store read_slot.
> 
> I am always cautious when reading about optimizations without any
> benchmarks or object code analysis. Although your explanation makes
> sense, did you check that performance didn’t decrease in some way?
> (Maybe the compiler even generates the same code.)
> 
> 
> Kind regards,
> 
> Paul
> 
> 

I compared the assembly code before and after the change:
  - With gcc 8.3.0, the generated code is the same.
  - With gcc 12.2.1, 'r10_bio->read_slot' is mostly read directly from
    the r10bio in memory:
      2853    while (sl != r10_bio->read_slot) {
        0xffffffff8213f2a0 <+1328>:  cmp    %r14d,0x38(%rbp)

    whereas 'slot' is mostly kept in a register of its own:
      2819    while (sl != slot) {
        0xffffffff8213f08a <+794>:   cmp    %r14d,%ebx

I have not measured the performance, and I do not expect a significant
improvement, which the assembly comparison also suggests. My real goal
is just to make the code more readable.
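
To show the two loop shapes in isolation, here is a minimal sketch. It is
not the actual fix_read_error() code: 'struct demo_r10bio' and both
function names are hypothetical, and the loop body is reduced to the
slot arithmetic. Whether the compiler reloads the field on each
iteration depends on what else the real loop does, so this only
illustrates the source-level difference.

```c
/* Hypothetical stand-in for struct r10bio; only the field discussed here. */
struct demo_r10bio {
	int read_slot;
};

/* Before: the loop condition names r10_bio->read_slot on every iteration. */
static int walk_back_deref(const struct demo_r10bio *r10_bio, int sl, int copies)
{
	int steps = 0;

	while (sl != r10_bio->read_slot) {
		if (sl == 0)
			sl = copies;	/* wrap around to the last slot */
		sl--;
		steps++;
	}
	return steps;
}

/* After: read_slot is copied into a local once and compared from there. */
static int walk_back_local(const struct demo_r10bio *r10_bio, int sl, int copies)
{
	int slot = r10_bio->read_slot;
	int steps = 0;

	while (sl != slot) {
		if (sl == 0)
			sl = copies;	/* wrap around to the last slot */
		sl--;
		steps++;
	}
	return steps;
}
```

Both functions return the same step count for the same input, assuming
'sl' and 'read_slot' are both in the range [0, copies). The only
difference is where the comparison operand comes from.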

-- 
Thanks,
Nan
