Message-ID: <ZatlggW/8SH6od9O@fedora>
Date: Sat, 20 Jan 2024 14:17:38 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Yury Norov <yury.norov@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
	Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
	Breno Leitao <leitao@...ian.org>,
	Nathan Chancellor <nathan@...nel.org>,
	Rasmus Villemoes <linux@...musvillemoes.dk>,
	Zi Yan <ziy@...dia.com>, ming.lei@...hat.com
Subject: Re: [PATCH 4/9] lib/group_cpus: optimize outer loop in
 grp_spread_init_one()

On Sat, Jan 20, 2024 at 11:51:58AM +0800, Ming Lei wrote:
> On Fri, Jan 19, 2024 at 06:50:48PM -0800, Yury Norov wrote:
> > Similarly to the inner loop, in the outer loop we can use for_each_cpu()
> > macro, and skip CPUs that have been moved.
> > 
> > With this patch, the function becomes O(1), despite that it's a
> > double-loop.
> > 
> > While here, add a comment why we can't merge outer logic into the inner
> > loop.
> > 
> > Signed-off-by: Yury Norov <yury.norov@...il.com>
> > ---
> >  lib/group_cpus.c | 14 ++++++++------
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> > 
> > diff --git a/lib/group_cpus.c b/lib/group_cpus.c
> > index 0a8ac7cb1a5d..952aac9eaa81 100644
> > --- a/lib/group_cpus.c
> > +++ b/lib/group_cpus.c
> > @@ -17,16 +17,17 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
> >  	const struct cpumask *siblmsk;
> >  	int cpu, sibl;
> >  
> > -	for ( ; cpus_per_grp > 0; ) {
> > -		cpu = cpumask_first(nmsk);
> > -
> > -		/* Should not happen, but I'm too lazy to think about it */
> > -		if (cpu >= nr_cpu_ids)
> > +	for_each_cpu(cpu, nmsk) {
> > +		if (cpus_per_grp-- == 0)
> >  			return;
> >  
> > +		/*
> > +		 * If a caller wants to spread IRQa on offline CPUs, we need to
> > +		 * take care of it explicitly because those offline CPUS are not
> > +		 * included in siblings cpumask.
> > +		 */
> >  		__cpumask_clear_cpu(cpu, nmsk);
> >  		__cpumask_set_cpu(cpu, irqmsk);
> > -		cpus_per_grp--;
> >  
> >  		/* If the cpu has siblings, use them first */
> >  		siblmsk = topology_sibling_cpumask(cpu);
> > @@ -38,6 +39,7 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
> >  
> >  			__cpumask_clear_cpu(sibl, nmsk);
> >  			__cpumask_set_cpu(sibl, irqmsk);
> > +			cpu = sibl + 1;
> 
> It has been tricky enough to update condition variable of for_each_cpu()
> (such kind of pattern can't build in Rust at all), and the above line could
> be more tricky actually.

Not only is the above line tricky, it is also wrong, because the local
variable 'cpu' should always point to the first set bit in 'nmsk'. If
you set it to 'sibl + 1', any bits still set in 'nmsk' below 'sibl' are
skipped by the loop, aren't they?
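
To make the concern concrete, here is a rough userspace sketch, not the
kernel code. It models the masks as plain 32-bit integers and compares
"always restart from the first set bit of nmsk" with "resume the scan
after the last sibling taken". NR_CPUS, sibling_mask() (which pairs CPU n
with CPU n ^ 2), next_bit(), move_cpu() and spread() are all made up for
illustration, and the exact increment behaviour of for_each_cpu() is
deliberately not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8u	/* hypothetical CPU count for this model */

/* Hypothetical topology: CPU n and CPU n ^ 2 are SMT siblings. */
static uint32_t sibling_mask(unsigned int cpu)
{
	return (1u << cpu) | (1u << (cpu ^ 2u));
}

/* First set bit at or after 'start', or NR_CPUS if there is none. */
static unsigned int next_bit(uint32_t mask, unsigned int start)
{
	for (unsigned int bit = start; bit < NR_CPUS; bit++)
		if (mask & (1u << bit))
			return bit;
	return NR_CPUS;
}

/* Move one CPU from the pool 'nmsk' into the group 'irqmsk'. */
static void move_cpu(unsigned int cpu, uint32_t *nmsk, uint32_t *irqmsk)
{
	assert(cpu < NR_CPUS);
	*nmsk &= ~(1u << cpu);
	*irqmsk |= 1u << cpu;
}

/*
 * Take up to 'cpus_per_grp' CPUs out of 'nmsk', siblings first.
 * skip_ahead == false: every outer pass restarts at the first set bit.
 * skip_ahead == true:  the outer scan resumes after the last sibling
 *                      taken, modelling 'cpu = sibl + 1'.
 */
static uint32_t spread(uint32_t *nmsk, unsigned int cpus_per_grp,
		       bool skip_ahead)
{
	uint32_t irqmsk = 0;
	unsigned int start = 0;

	while (cpus_per_grp > 0) {
		unsigned int cpu = next_bit(*nmsk, start);

		if (cpu >= NR_CPUS)
			break;

		move_cpu(cpu, nmsk, &irqmsk);
		cpus_per_grp--;
		start = skip_ahead ? cpu + 1 : 0;

		uint32_t siblmsk = sibling_mask(cpu);

		for (unsigned int sibl = next_bit(siblmsk & *nmsk, cpu + 1);
		     sibl < NR_CPUS && cpus_per_grp > 0;
		     sibl = next_bit(siblmsk & *nmsk, sibl + 1)) {
			move_cpu(sibl, nmsk, &irqmsk);
			cpus_per_grp--;
			if (skip_ahead)
				start = sibl + 1;
		}
	}

	return irqmsk;
}
```

With nmsk = 0x0f (CPUs 0-3) and cpus_per_grp = 3 under this made-up
topology, the restarting variant picks CPUs 0, 2, 1, while the
skip-ahead variant picks CPUs 0, 2, 3 and leaves CPU 1 behind in nmsk
even though it comes before CPU 3.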


Thanks,
Ming

