Message-ID: <Z4pwUxE23JEG5flR@vaxr-BM6660-BM6360>
Date: Fri, 17 Jan 2025 22:59:31 +0800
From: I Hsin Cheng <richard120310@...il.com>
To: Kuan-Wei Chiu <visitorckw@...il.com>
Cc: yury.norov@...il.com, linux@...musvillemoes.dk, jserv@...s.ncku.edu.tw,
	mark.rutland@....com, linux-kernel@...r.kernel.org,
	eleanor15x@...il.com
Subject: Re: [PATCH] cpumask: Optimize cpumask_any_but()

On Fri, Jan 17, 2025 at 10:26:58PM +0800, Kuan-Wei Chiu wrote:
> The cpumask_any_but() function can avoid using a loop to determine the
> CPU index to return. If the first set bit in the cpumask is not equal
> to the specified CPU, we can directly return the index of the first set
> bit. Otherwise, we return the next set bit's index.
> 
> This optimization replaces the loop with a single if statement,
> allowing the compiler to generate more concise and efficient code.
> 
> As a result, the size of the bzImage built with x86 defconfig is
> reduced by 4096 bytes:
> 
> * Before:
> $ size arch/x86/boot/bzImage
>    text    data     bss     dec     hex filename
> 13537280           1024       0 13538304         ce9400 arch/x86/boot/bzImage
> 
> * After:
> $ size arch/x86/boot/bzImage
>    text    data     bss     dec     hex filename
> 13533184           1024       0 13534208         ce8400 arch/x86/boot/bzImage
> 
> Co-developed-by: Yu-Chun Lin <eleanor15x@...il.com>
> Signed-off-by: Yu-Chun Lin <eleanor15x@...il.com>
> Signed-off-by: Kuan-Wei Chiu <visitorckw@...il.com>
> ---
> Not sure how to measure the efficiency difference, but I guess this
> patch might be slightly more efficient or nearly the same as before. If
> you have any good ideas for measuring efficiency, please let me know!
> 
>  include/linux/cpumask.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index 9278a50d514f..b769fcdbaa10 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -404,10 +404,10 @@ unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
>  	unsigned int i;
>  
>  	cpumask_check(cpu);
> -	for_each_cpu(i, mask)
> -		if (i != cpu)
> -			break;
> -	return i;
> +	i = find_first_bit(cpumask_bits(mask), small_cpumask_bits);

Hi Kuan-Wei,

How about using cpumask_first(mask) here for better consistency?

> +	if (i != cpu)
> +		return i;

Wouldn't it help a bit to check "i >= nr_cpu_ids" before calling
find_next_bit()? If "i >= nr_cpu_ids" holds, calling find_next_bit()
would be wasted work.
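
To make the suggestion concrete, here is a rough user-space sketch, not
kernel code. It assumes a single unsigned long stands in for the cpumask,
and every model_* name and MODEL_NR_CPUS is hypothetical, made up for
illustration only. It puts the original loop next to the single-branch
version with the early check added. In this model, when cpu is a valid
index, an empty mask already takes the "i != cpu" return, so the explicit
check only changes anything for an out-of-range cpu.

```c
#include <limits.h>

/* Hypothetical model: one unsigned long plays the role of the cpumask. */
#define MODEL_NR_CPUS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Index of the first set bit at or after start, or MODEL_NR_CPUS if none. */
static unsigned int model_find_next_bit(unsigned long mask, unsigned int start)
{
	unsigned int i;

	for (i = start; i < MODEL_NR_CPUS; i++)
		if (mask & (1UL << i))
			return i;
	return MODEL_NR_CPUS;
}

/* Loop form, modelled on the code the patch removes. */
static unsigned int model_any_but_loop(unsigned long mask, unsigned int cpu)
{
	unsigned int i;

	for (i = model_find_next_bit(mask, 0); i < MODEL_NR_CPUS;
	     i = model_find_next_bit(mask, i + 1))
		if (i != cpu)
			break;
	return i;
}

/* Single-branch form with the early "no bit set" check suggested above. */
static unsigned int model_any_but(unsigned long mask, unsigned int cpu)
{
	unsigned int i = model_find_next_bit(mask, 0);

	/* Empty mask: there is nothing left to search for. */
	if (i >= MODEL_NR_CPUS)
		return MODEL_NR_CPUS;
	if (i != cpu)
		return i;
	return model_find_next_bit(mask, i + 1);
}
```

Both forms should agree for every mask and cpu value, which is the
property worth checking before comparing the generated code.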

> +	return find_next_bit(cpumask_bits(mask), small_cpumask_bits, i + 1);
>  }
>  

Regards,
I Hsin

>  /**
> -- 
> 2.34.1
> 
