Date:   Wed, 22 Jan 2020 15:13:26 -0600
From:   Scott Wood <swood@...hat.com>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        linux-rt-users <linux-rt-users@...r.kernel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH RT] sched: migrate_enable: Busy loop until the migration request is completed

On Fri, 2019-12-13 at 09:14 +0100, Sebastian Andrzej Siewior wrote:
> On 2019-12-13 00:44:22 [-0600], Scott Wood wrote:
> > > @@ -8239,7 +8239,10 @@ void migrate_enable(void)
> > >  		stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop,
> > >  				    &arg, &work);
> > >  		__schedule(true);
> > > -		WARN_ON_ONCE(!arg.done && !work.disabled);
> > > +		if (!work.disabled) {
> > > +			while (!arg.done)
> > > +				cpu_relax();
> > > +		}
> > 
> > We should enable preemption while spinning -- besides the general
> > badness
> > of spinning with it disabled, there could be deadlock scenarios if
> > multiple CPUs are spinning in such a loop.  Long term maybe have a way
> > to
> > dequeue the no-longer-needed work instead of waiting.
> 
> Hmm. My plan was to use per-CPU memory and spin before the request is
> enqueued if the previous isn't done yet (which should not happen™).

Either it can't happen (and thus there's no need to spin), or it can, and we
need to worry about deadlocks if we're spinning with preemption disabled.  In
fact, a deadlock is guaranteed if we're spinning with preemption disabled on
the cpu that's supposed to be running the stopper we're waiting on.

I think you're right that it can't happen though (as long as we queue it
before enabling preemption, the stopper will be runnable and nothing else
can run on the cpu before the queue gets drained), so we can just make it a
warning.  I'm testing a patch now.

> Then we could remove __schedule() here and rely on preempt_enable()
> doing that.

We could do that regardless.

-Scott

