Open Source and information security mailing list archives
 
Date:   Tue, 30 Apr 2019 12:03:18 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     "Paul E. McKenney" <paulmck@...ux.ibm.com>
Cc:     linux-kernel@...r.kernel.org, andrea.parri@...rulasolutions.com
Subject: Re: Question about sched_setaffinity()

On Sat, Apr 27, 2019 at 11:02:46AM -0700, Paul E. McKenney wrote:

> This actually passes rcutorture.  But, as Andrea noted, not klitmus.
> After some investigation, it turned out that klitmus was creating kthreads
> with PF_NO_SETAFFINITY, hence the failures.  But that prompted me to
> put checks into my code: After all, rcutorture can be fooled.
> 
> 	void synchronize_rcu(void)
> 	{
> 		int cpu;
> 
> 		for_each_online_cpu(cpu) {
> 			sched_setaffinity(current->pid, cpumask_of(cpu));
> 			WARN_ON_ONCE(raw_smp_processor_id() != cpu);
> 		}
> 	}
> 
> This triggers fairly quickly, usually in less than a minute of rcutorture
> testing.
>
> And further investigation shows that sched_setaffinity()
> always returned 0. 

> Is this expected behavior?  Is there some configuration or setup that I
> might be missing?

ISTR there is hotplug involved in RCU torture? In that case, it may be
that sched_setaffinity() succeeds in placing us on a CPU, which CPU
hotplug then takes away. So by the time we run the WARN_ON_ONCE(), we'll
be running on a different CPU than expected.
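
For anyone who wants to poke at the same pin-then-check pattern from
userspace, here is a rough sketch using sched_setaffinity(2) and
sched_getcpu(3). It is not the kernel loop; the helper names
pin_and_check() and check_all_cpus() are made up for illustration.
Userspace has no equivalent of holding the hotplug lock, so if a CPU is
taken offline between the two calls I'd expect the same kind of mismatch
to be possible here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper: pin the calling thread to @cpu, then check
 * where we actually ended up.
 * Returns 1 if running on @cpu, 0 if not, -1 if the pin itself failed.
 */
static int pin_and_check(int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set) != 0)
		return -1;	/* e.g. EINVAL: CPU offline or not permitted */

	return sched_getcpu() == cpu;
}

/*
 * Hypothetical helper: visit every CPU in our original affinity mask
 * and count how often we returned from a successful pin on the wrong CPU.
 * Returns that count, or -1 if the original mask could not be read.
 */
static int check_all_cpus(void)
{
	cpu_set_t orig;
	int cpu, bad = 0;

	if (sched_getaffinity(0, sizeof(orig), &orig) != 0)
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, &orig))
			continue;
		if (pin_and_check(cpu) == 0)
			bad++;
	}

	/* Best effort: put the original mask back. */
	sched_setaffinity(0, sizeof(orig), &orig);
	return bad;
}
```

A failed pin (-1) is deliberately not counted as a mismatch; only the
"returned 0 but running elsewhere" case is, since that is the one in
question here.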

If, OTOH, your loop is written like this (as it really should be):

	void synchronize_rcu(void)
	{
		int cpu;

		cpus_read_lock();
		for_each_online_cpu(cpu) {
			sched_setaffinity(current->pid, cpumask_of(cpu));
			WARN_ON_ONCE(raw_smp_processor_id() != cpu);
		}
		cpus_read_unlock();
	}

Then I'm not entirely sure how we can return 0 and not run on the
expected CPU. If we look at __set_cpus_allowed_ptr(), the only paths out
to 0 are:

 - if the mask didn't change
 - if we are already running inside the new mask
 - if we migrated ourselves with the stop-task
 - if we're not in fact running

That last case should never trigger in your circumstances, since @p ==
current and current is obviously running. But for completeness, the
wakeup of @p would do the task placement in that case.
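
To make the four cases concrete, here is a toy model of that decision
order. This is NOT the kernel function: struct toy_task, enum exit_path
and toy_set_cpus_allowed() are hypothetical stand-ins, a 64-bit word
replaces the cpumask, and locking, error paths and the real migration
machinery are all left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the four "return 0" cases listed above. */
enum exit_path {
	EXIT_MASK_UNCHANGED,
	EXIT_ALREADY_IN_MASK,
	EXIT_STOPPER_MIGRATION,
	EXIT_NOT_RUNNING,
};

/* Toy task: not the kernel's task_struct. */
struct toy_task {
	uint64_t allowed;	/* bit n set: task may run on CPU n */
	int cpu;		/* CPU the task is on, 0..63 */
	bool running;		/* currently executing on @cpu? */
};

/* Lowest CPU number set in @mask; @mask must be non-zero. */
static int toy_first_cpu(uint64_t mask)
{
	int cpu = 0;

	while (!(mask & 1)) {
		mask >>= 1;
		cpu++;
	}
	return cpu;
}

static enum exit_path toy_set_cpus_allowed(struct toy_task *p,
					   uint64_t new_mask)
{
	assert(new_mask != 0);	/* an empty mask is an error, not modelled */

	if (p->allowed == new_mask)
		return EXIT_MASK_UNCHANGED;

	p->allowed = new_mask;

	if (new_mask & (UINT64_C(1) << p->cpu))
		return EXIT_ALREADY_IN_MASK;

	if (p->running) {
		/* Stand-in for the stop-task pushing us to an allowed CPU. */
		p->cpu = toy_first_cpu(new_mask);
		return EXIT_STOPPER_MIGRATION;
	}

	/* Not running: placement is left to the next wakeup. */
	return EXIT_NOT_RUNNING;
}
```

In this model, whenever the task is running, every exit leaves p->cpu
inside the new mask; only the not-running exit can leave it outside.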
