Message-ID: <2b02076d.b3d5.1963f626edb.Coremail.xavier_qy@163.com>
Date: Thu, 17 Apr 2025 00:15:37 +0800 (CST)
From: Xavier  <xavier_qy@....com>
To: "David Laight" <david.laight.linux@...il.com>
Cc: ryan.roberts@....com, dev.jain@....com, ioworker0@...il.com,
	21cnbao@...il.com, akpm@...ux-foundation.org,
	catalin.marinas@....com, david@...hat.com, gshan@...hat.com,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	will@...nel.org, willy@...radead.org, ziy@...dia.com
Subject: Re: [mm/contpte v3 1/1] mm/contpte: Optimize loop to reduce
 redundant operations




At 2025-04-16 16:57:06, "David Laight" <david.laight.linux@...il.com> wrote:
>On Tue, 15 Apr 2025 16:22:05 +0800
>Xavier <xavier_qy@....com> wrote:
>
>> This commit optimizes the contpte_ptep_get function by adding early
>>  termination logic. It checks if the dirty and young bits of orig_pte
>>  are already set and skips redundant bit-setting operations during
>>  the loop. This reduces unnecessary iterations and improves performance.
>
>Benchmarks?
>
>As has been pointed out before CONT_PTES is small and IIRC dirty+young
>is unusual.

I haven't found a suitable benchmark yet. I will write some more general
test scenarios and post the results in a follow-up email.
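
For reference, the early-termination idea from the commit message can be
sketched in user space as below. This is a simplified model and not the
patch itself: the MODEL_* constants, the bit positions, the block size of 16
and both function names are assumptions made for illustration, and a plain
uint64_t array stands in for the PTEs. The reads counter only exists so the
two variants can be compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model: bit positions and block size are
 * illustrative and are not the arm64 definitions. */
#define MODEL_CONT_PTES 16
#define MODEL_DIRTY     (UINT64_C(1) << 0)
#define MODEL_YOUNG     (UINT64_C(1) << 1)
#define MODEL_FLAGS     (MODEL_DIRTY | MODEL_YOUNG)

/* Baseline: always read every entry and OR dirty/young into orig. */
static uint64_t model_get_full(const uint64_t *ptes, uint64_t orig,
			       size_t *reads)
{
	size_t i;

	assert(ptes && reads);
	*reads = 0;
	for (i = 0; i < MODEL_CONT_PTES; i++) {
		orig |= ptes[i] & MODEL_FLAGS;
		(*reads)++;
	}
	return orig;
}

/* Early exit: once both bits are set, further reads cannot change the
 * result, so the loop stops. */
static uint64_t model_get_early(const uint64_t *ptes, uint64_t orig,
				size_t *reads)
{
	size_t i;

	assert(ptes && reads);
	*reads = 0;
	for (i = 0; i < MODEL_CONT_PTES; i++) {
		if ((orig & MODEL_FLAGS) == MODEL_FLAGS)
			break;
		orig |= ptes[i] & MODEL_FLAGS;
		(*reads)++;
	}
	return orig;
}
```

Both variants return the same flags. The early-exit one only saves reads
when dirty and young are both found before the end of the block, which is
the case David suggests is unusual, so any gain has to be measured.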

>
>> 
>> Signed-off-by: Xavier <xavier_qy@....com>
>> ---
>>  arch/arm64/mm/contpte.c | 20 ++++++++++++++++++--
>>  1 file changed, 18 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
>> index bcac4f55f9c1..0acfee604947 100644
>> --- a/arch/arm64/mm/contpte.c
>> +++ b/arch/arm64/mm/contpte.c
>> @@ -152,6 +152,16 @@ void __contpte_try_unfold(struct mm_struct *mm, unsigned long addr,
>>  }
>>  EXPORT_SYMBOL_GPL(__contpte_try_unfold);
>>  
>> +/* Note: in order to improve efficiency, using this macro will modify the
>> + * passed-in parameters.*/
>
>... this macro modifies ...
>
>But you can make it obvious by passing by reference.
>The compiler will generate the same code.
>

This part may also be further refined.
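
A minimal sketch of the pass-by-reference suggestion, in a user-space
model: MODEL_MERGE_MACRO, model_merge() and the MODEL_* bits are
hypothetical names for illustration and do not appear in the patch. The
macro assigns to its first argument with nothing at the call site to show
it, while the function takes a pointer, so the caller has to write &acc.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions, not the arm64 definitions. */
#define MODEL_DIRTY (UINT64_C(1) << 0)
#define MODEL_YOUNG (UINT64_C(1) << 1)

/* Macro form: silently modifies the caller's variable `acc`. */
#define MODEL_MERGE_MACRO(acc, pte) \
	do { (acc) |= (pte) & (MODEL_DIRTY | MODEL_YOUNG); } while (0)

/* Function form: the '&' at the call site makes the modification
 * visible to the reader. */
static inline void model_merge(uint64_t *acc, uint64_t pte)
{
	assert(acc);
	*acc |= pte & (MODEL_DIRTY | MODEL_YOUNG);
}
```

With optimization enabled, a compiler would normally inline a small static
inline helper like this, so the generated code is expected to match the
macro version.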

--
Thanks,
Xavier
