[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170413090211.6h5cx2q3odmz5wcj@hirez.programming.kicks-ass.net>
Date: Thu, 13 Apr 2017 11:02:11 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
Sebastian Siewior <bigeasy@...utronix.de>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
"David S. Miller" <davem@...emloft.net>,
Fenghua Yu <fenghua.yu@...el.com>,
Herbert Xu <herbert@...dor.apana.org.au>,
Lai Jiangshan <jiangshanlai@...il.com>,
Len Brown <lenb@...nel.org>,
Michael Ellerman <mpe@...erman.id.au>,
"Rafael J. Wysocki" <rjw@...ysocki.net>, Tejun Heo <tj@...nel.org>,
Tony Luck <tony.luck@...el.com>,
Viresh Kumar <viresh.kumar@...aro.org>
Subject: Re: [patch 00/13] sched/treewide: Clean up various racy task
affinity issues
On Wed, Apr 12, 2017 at 10:07:26PM +0200, Thomas Gleixner wrote:
> While dealing with the fallout of the scheduler cleanups on RT, we found
> several racy usage sites of the following scheme:
>
> cpumask_copy(&save_cpus_allowed, ¤t->cpus_allowed);
> set_cpus_allowed_ptr(current, cpumask_of(cpu));
> do_stuff();
> set_cpus_allowed_ptr(current, &save_cpus_allowed);
>
> That's racy in two aspects:
>
> 1) Nothing prevents the CPU from being unplugged after the temporary
> affinity setting is in place. This results on code being executed on the
> wrong CPU(s).
>
> 2) Nothing prevents a concurrent affinity setting from user space. That
> also results in code being executed on the wrong CPU(s) and the restore
> of the previous affinity setting overwrites the new one.
>
> Various variants of cleanups:
>
> - Removal, because the calling thread is already guaranteed to run on the
> correct CPU.
>
> - Conversion to smp function calls (simple register read/write)
>
> - Conversion to work_on_cpu(). There were even files containing comments
> to that effect.
>
> - The rest needs seperate hotplug protection for work_on_cpu(). To avoid open
> coding the
>
> get_online_cpus();
> if (cpu_online(cpu))
> ret = do_stuff();
> else
> ret = -ENODEV;
> put_online_cpus();
>
> scheme this series provides a new helper function work_on_cpu_safe()
> which implements the above.
>
> Aside of fixing these races this allows to restrict the access to
> current->cpus_allowed with a follow up series.
Looks good, thanks for tackling these!
Powered by blists - more mailing lists