lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z4aGPI0yka6qL-dF@thinkpad>
Date: Tue, 14 Jan 2025 10:43:56 -0500
From: Yury Norov <yury.norov@...il.com>
To: I Hsin Cheng <richard120310@...il.com>
Cc: mark.rutland@....com, linux@...musvillemoes.dk, jserv@...s.ncku.edu.tw,
	linux-kernel@...r.kernel.org, visitorckw@...il.com
Subject: Re: [RFC PATCH] cpumask: Implement "random" version of
 cpumask_any_but()

> > > > @@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
> > > >  static __always_inline
> > > >  unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
> > > >  {
> > > > -	unsigned int i;
> > > > +	unsigned int i, n, weight;
> > > >  
> > > >  	cpumask_check(cpu);
> > > > -	for_each_cpu(i, mask)
> > > > -		if (i != cpu)
> > > > -			break;
> > > > +	weight = cpumask_weight(mask);
> > > > +	n = get_random_u32() % weight;
> > > > +
> > > > +	/* If we accidentally pick "n" equal to "cpu",
> > > > +	 * then simply choose "n + 1"th cpu.
> > > > +	 */
> > > > +	if (n == cpu)
> > > > +		n = (n + 1) % weight;
> > > > +	i = cpumask_nth(n, mask);
> > 
> > This is an entirely broken thing, and it works only because your CPU mask
> > is dense. Imagine cpumask: 0111 1111. Your new cpumask_any_but(mask, 5)
> > will return 5 exactly, if the get_random_u32() draws 4.
> > 
> 
> Oh I'm sorry for this part, "cpumask_nth()" return value should be
> checked instead of checking "get_random_u32()". May I ask why does it
> only work for dense cpumask?

Because for dense mask cpumask_nth(n) == n, and your
 	n = get_random_u32() % weight
is the same as 
 	n = cpumask_nth(get_random_u32() % weight)

> I mean clearly original method will be
> faster when cpumask isn't dense.
> 
> > It looks broken even for a dense mask. By probability, your code returns:
> > 
> > P(0-4,7) == 1/8,
> > P(5) == 0,
> > P(6) == 1/4.
> > 
> > Assuming you are trying to implement a random uniform distribution drawing,
> > the correct probabilities should look like:
> > 
> > P(0-4,6-7) == 1/7,
> > P(5) == 0,
> > 
> > Thanks,
> > Yury
> > 
> 
> I did noticed this part, I was thinking that we do not actually need to
> make the prbability to be uniform distribution, just want to make it
> scatters more then picking certain number.

On cpumask level you can't speculate whether users need true randomness
or not. That's why we pay so much attention to proper function naming.

See: 
 - cpumask_any() - 'any' requirement is much simpler than random;
 - cpumask_any_distribute() - not random at all, just a better (good
   enough) approximation;
 - cpumask_random() - a true random drawing. Must pass Chi^2, 
   Kolmogorov-Smirnov and other fancy statistical tests. As you can see,
   doesn't exist.

If in your function you use expensive true-random get_random_u32(),
one can expect that the output will not be worse than the original
uniform distribution, by the statistical properties means.

> Thanks for your detailed review and explanation, also thanks to Mark and
> Kuan-Wei !
> 
> I now realize "random" here is more of a convention for the caller to
> states that it doesn't matter which cpu it gets, maybe we should
> rephrase the comment to make it less confusing? Because I think "random"
> itself does stands for a particular meaning.

Feel free to send a patch.

Thanks,
Yury

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ