Message-ID: <951669027.771.1567544027663.JavaMail.zimbra@efficios.com>
Date: Tue, 3 Sep 2019 16:53:47 -0400 (EDT)
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: paulmck <paulmck@...ux.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Oleg Nesterov <oleg@...hat.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
"Russell King, ARM Linux" <linux@...linux.org.uk>,
Chris Metcalf <cmetcalf@...hip.com>,
Chris Lameter <cl@...ux.com>, Kirill Tkhai <tkhai@...dex.ru>,
Mike Galbraith <efault@....de>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>
Subject: Re: [RFC PATCH 1/2] Fix: sched/membarrier: p->mm->membarrier_state
racy load
----- On Sep 3, 2019, at 4:27 PM, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Tue, Sep 3, 2019 at 1:11 PM Mathieu Desnoyers
> <mathieu.desnoyers@...icios.com> wrote:
>>
>> + cpus_read_lock();
>> + for_each_online_cpu(cpu) {
>
> This would likely be better off using mm_cpumask(mm) instead of all
> online CPU's.
I considered using mm_cpumask(mm) in the original implementation of
the membarrier expedited private command, and chose to stick to the
online cpu mask instead.
Here was my off-list justification to Peter Zijlstra and Paul E. McKenney:
If we have an iteration on mm_cpumask in the membarrier code,
then we additionally need to document that memory barriers are
required before and/or after all updates to the mm_cpumask, otherwise
I think we end up in the same situation as with the rq->curr update.
[...]
So we'd be sprinkling even more memory barrier comments all over.
Considering the number of comments that needed to be added around the
scheduler rq->curr update for membarrier, I'm concerned that the
additional analysis, documentation, and design constraints required to
safely use mm_cpumask() from membarrier are not worth it compared to
iterating over online CPUs with the CPU hotplug read lock held.
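For reference, this is roughly the shape of the loop as posted (a
simplified, untested sketch; "tmpmask" stands in for whatever cpumask
the actual patch populates before sending the IPIs):

	cpus_read_lock();
	for_each_online_cpu(cpu) {
		struct task_struct *p;

		/*
		 * The hotplug read lock gives a stable view of the
		 * online CPUs, so no ordering requirements on
		 * mm_cpumask() updates need to be documented.
		 */
		rcu_read_lock();
		p = rcu_dereference(cpu_rq(cpu)->curr);
		if (p && p->mm == mm)
			__cpumask_set_cpu(cpu, tmpmask);
		rcu_read_unlock();
	}
	cpus_read_unlock();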
>
> Plus doing the rcu_read_lock() inside the loop seems pointless. Even
> with a lot of cores, it's not going to loop _that_ many times for RCU
> latency to be an issue.
Good point! I'll keep that in mind for the next round, if we don't
choose an entirely different way forward.
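Concretely, the next round could simply hoist the RCU read-side
critical section out of the loop, along these lines (again an untested
sketch, same caveats as above):

	cpus_read_lock();
	rcu_read_lock();
	for_each_online_cpu(cpu) {
		struct task_struct *p;

		p = rcu_dereference(cpu_rq(cpu)->curr);
		if (p && p->mm == mm)
			__cpumask_set_cpu(cpu, tmpmask);
	}
	rcu_read_unlock();
	cpus_read_unlock();

A single read-side critical section spanning the whole loop should not
be an issue for RCU latency, as you point out.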
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com