Message-ID: <20191202201338.GH16681@devbig004.ftw2.facebook.com>
Date:   Mon, 2 Dec 2019 12:13:38 -0800
From:   Tejun Heo <tj@...nel.org>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     jiangshanlai@...il.com, linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: Workqueues splat due to ending up on wrong CPU

Hello, Paul.

(cc'ing scheduler folks - under the rcu cpu hotplug torture test, the
workqueue rescuer is very occasionally triggering a warning which says
that it isn't on the cpu it should be on.  It's checking that
smp_processor_id() is the expected one after a successful
set_cpus_allowed_ptr() call.)
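
For the scheduler folks, the check that's tripping is roughly the
following; this is a paraphrase from memory of the workqueue code, not
a verbatim quote:

  /*
   * After the rescuer has attached to a per-cpu pool and
   * set_cpus_allowed_ptr() returned 0, it expects to already be
   * running on that pool's cpu.
   */
  WARN_ON_ONCE(raw_smp_processor_id() != pool->cpu);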

On Sun, Dec 01, 2019 at 05:55:48PM -0800, Paul E. McKenney wrote:
> > And hyperthreading seems to have done the trick!  One splat thus far,
> > shown below.  The run should complete this evening, Pacific Time.
> 
> That was the only one for that run, but another 24*56-hour run got three
> more.  All of them expected to be on CPU 0 (which never goes offline, so
> why?) and the "XXX" diagnostic never did print.

Heh, I didn't expect that, so maybe set_cpus_allowed_ptr() is
returning 0 while not migrating the rescuer task to the target cpu for
some reason?

The rescuer always calls set_cpus_allowed_ptr() on itself, so it must
be running at that point.  set_cpus_allowed_ptr() migrates a live task
by calling stop_one_cpu(), which schedules a migration function that
runs from a highpri stopper task on the cpu the task is currently on.
Please take a look at the following.

  static bool cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
  {
          ...
	  enabled = stopper->enabled;
	  if (enabled)
		  __cpu_stop_queue_work(stopper, work, &wakeq);
	  else if (work->done)
		  cpu_stop_signal_done(work->done);
          ...
  }
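
And the caller side, as I read __set_cpus_allowed_ptr() in
kernel/sched/core.c (paraphrased, most of the function elided):

  static int __set_cpus_allowed_ptr(struct task_struct *p,
                                    const struct cpumask *new_mask,
                                    bool check)
  {
          ...
          if (task_running(rq, p) || p->state == TASK_WAKING) {
                  struct migration_arg arg = { p, dest_cpu };

                  /*
                   * A running task is punted to the stopper on its
                   * current cpu, which is expected to do the actual
                   * migration.
                   */
                  task_rq_unlock(rq, p, &rf);
                  stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
                  return 0;
          }
          ...
  }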

So, if stopper->enabled is clear, it'll signal completion without
running the work.  stopper->enabled is cleared during cpu hotunplug
and restored from bringup_cpu() while cpu is being brought back up.

  static int bringup_wait_for_ap(unsigned int cpu)
  {
          ...
	  stop_machine_unpark(cpu);
          ...
  }

  static int bringup_cpu(unsigned int cpu)
  {
	  ...
	  ret = __cpu_up(cpu, idle);
          ...
	  return bringup_wait_for_ap(cpu);
  }

__cpu_up() is what marks the cpu online, and once the cpu is online
kthreads are free to migrate into it.  So it looks like there's a
brief window where a cpu is marked online but its stopper thread is
still disabled, meaning that a kthread may be scheduled onto the cpu
but can't be migrated off of it, which would explain the symptom you
were seeing.
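
To spell the window out (my reading of the ordering; step names are
approximate):

  /*
   * bringup_cpu(N)
   *   __cpu_up(N, idle)          <- N marked online; kthreads may now
   *                                 be woken/migrated onto it
   *
   *   <window> a kthread's set_cpus_allowed_ptr() needs N's stopper,
   *            but cpu_stop_queue_work() sees stopper->enabled ==
   *            false and signals completion without moving the task
   *
   *   bringup_wait_for_ap(N)
   *     stop_machine_unpark(N)   <- stopper->enabled finally set
   */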

This leaves the task's cpumask and the cpu it is actually on in
disagreement, and retries become no-ops.  I can work around it on the
workqueue side by excluding rescuer attachments against cpu hotplug,
but this looks like a genuine cpu hotplug bug.
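
The workqueue-side workaround I have in mind would be something along
these lines - an untested sketch, in the rescuer's attach path:

  /*
   * Untested idea: hold the hotplug read lock across the rescuer's
   * attachment so a cpu can't be mid-bringup while the rescuer
   * migrates onto its pool.
   */
  cpus_read_lock();
  worker_attach_to_pool(rescuer, pool);
  cpus_read_unlock();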

It could be that I'm misreading the code.  What do you guys think?

Thanks.

-- 
tejun
