Message-ID: <929ee551-e7ed-4dbc-9c9a-b2b02585a960@quicinc.com>
Date: Fri, 24 Jan 2025 10:41:41 +0800
From: Ziqi Chen <quic_ziqichen@...cinc.com>
To: Bart Van Assche <bvanassche@....org>, <quic_cang@...cinc.com>,
<mani@...nel.org>, <beanhuo@...ron.com>, <avri.altman@....com>,
<junwoo80.lee@...sung.com>, <martin.petersen@...cle.com>,
<quic_nguyenb@...cinc.com>, <quic_nitirawa@...cinc.com>,
<quic_rampraka@...cinc.com>
CC: <linux-arm-msm@...r.kernel.org>, <linux-scsi@...r.kernel.org>,
 Alim Akhtar <alim.akhtar@...sung.com>,
 "James E.J. Bottomley" <James.Bottomley@...senPartnership.com>,
 Peter Wang <peter.wang@...iatek.com>,
 Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
 Andrew Halaney <ahalaney@...hat.com>,
 Maramaina Naresh <quic_mnaresh@...cinc.com>,
 open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 5/8] scsi: ufs: core: Enable multi-level gear scaling
On 1/24/2025 2:02 AM, Bart Van Assche wrote:
> On 1/22/25 11:41 PM, Ziqi Chen wrote:
>> We use memcpy() here because memcpy() can be faster than direct
>> assignment. We are not worried about safety because both operands are
>> the same struct, "ufs_pa_layer_attr", so the number of bytes and the
>> member types are guaranteed to match.
>
> The memcpy() call we are discussing is not in the hot path so it doesn't
> have to be hyper-optimized. Making the compiler perform type checking is
> more important in this code path than micro-optimizing the code.
>
> Additionally, please do not try to be smarter than the compiler.
> Compilers are able to convert struct assignments into a memcpy() call if
> there are good reasons to assume that the memcpy() call will be faster.
>
> Given the small size of struct ufs_pa_layer_attr (7 * 4 = 28 bytes),
> memberwise assignment probably is faster than a memcpy() call. The trunk
> version of gcc (ARM64) translates a memberwise assignment of struct
> ufs_pa_layer_attr into the following four assembler instructions (x0 and
> x1 point to struct ufs_pa_layer_attr instances, q30 and q31 are 128 bit
> registers):
>
> ldr q30, [x1]
> ldr q31, [x1, 12]
> str q30, [x0]
> str q31, [x0, 12]
>
> Thanks,
>
> Bart.
>
Sure, let me try it and test it. If it works fine, I will update it in
the next version.
-Ziqi
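
For illustration only, the two copy styles discussed above can be compared
in a small userspace sketch. The struct below is a hypothetical stand-in
for struct ufs_pa_layer_attr (seven 32-bit members, 28 bytes, as stated in
the thread); its name, its member names, and the helper functions are
assumptions made for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for struct ufs_pa_layer_attr:
 * seven 32-bit members, 28 bytes. Member names are illustrative.
 */
struct demo_pa_layer_attr {
	uint32_t gear_rx;
	uint32_t gear_tx;
	uint32_t lane_rx;
	uint32_t lane_tx;
	uint32_t pwr_rx;
	uint32_t pwr_tx;
	uint32_t hs_rate;
};

/* Matches the "7 * 4 = 28 bytes" figure quoted in the thread. */
static_assert(sizeof(struct demo_pa_layer_attr) == 7 * 4,
	      "expected 28 bytes with no padding");

/* Struct assignment: the compiler checks that both sides have the same type. */
static void copy_by_assignment(struct demo_pa_layer_attr *dst,
			       const struct demo_pa_layer_attr *src)
{
	*dst = *src;
}

/*
 * memcpy(): the pointers convert implicitly to void *, so passing a
 * different struct type or a wrong size would still compile.
 */
static void copy_by_memcpy(struct demo_pa_layer_attr *dst,
			   const struct demo_pa_layer_attr *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/* Returns 1 when both copy styles produce byte-identical results. */
static int demo_copies_match(void)
{
	const struct demo_pa_layer_attr src = {
		.gear_rx = 4, .gear_tx = 4,
		.lane_rx = 2, .lane_tx = 2,
		.pwr_rx = 1, .pwr_tx = 1,
		.hs_rate = 2,
	};
	struct demo_pa_layer_attr a, b;

	copy_by_assignment(&a, &src);
	copy_by_memcpy(&b, &src);
	return memcmp(&a, &b, sizeof(a)) == 0;
}
```

Both helpers leave the destination with the same bytes; the difference is
that only the assignment form lets the compiler reject mismatched types.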