Message-Id: <20180710174225.GA3593@linux.vnet.ibm.com>
Date: Tue, 10 Jul 2018 10:42:25 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Joel Fernandes <joel@...lfernandes.org>
Cc: Joel Fernandes <joelaf@...gle.com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Daniel Colascione <dancol@...gle.com>,
Alexei Starovoitov <ast@...com>,
linux-kernel <linux-kernel@...r.kernel.org>,
Tim Murray <timmurray@...gle.com>,
Daniel Borkmann <daniel@...earbox.net>,
netdev <netdev@...r.kernel.org>, fengc@...gle.com
Subject: Re: [RFC] Add BPF_SYNCHRONIZE bpf(2) command
On Tue, Jul 10, 2018 at 10:29:57AM -0700, Joel Fernandes wrote:
> On Tue, Jul 10, 2018 at 10:12:29AM -0700, Paul E. McKenney wrote:
> [..]
> > > > > The other question I have is about the whole "nohz-full doesn't work" thing.
> > > > > I didn't fully understand why. RCU is already tracking the state of nohz-full
> > > > > CPUs because the rcu dynticks code in (kernel/rcu/tree.c) monitors
> > > > > transitions to and from usermode even if the timer tick is turned off. So why
> > > > > would it not work?
> > > >
> > > > In the nohz_full case, there is no need for sys_membarrier()'s call to
> > > > synchronize_sched() to interact directly with the nohz_full CPU. It
> > > > can instead look at the target CPU's dyntick-idle state, and that state
> > > > would potentially have been set in the dim distant past, thus having
> > > > no effect on the target CPU's current execution.
> > >
> > > In nohz-idle case though, there's nothing to promote the barrier() to
> > > smp_mb() if you were to purely look at the dynticks-idle state on the
> > > nohz-full CPU executing in user mode?
> > >
> > > So then it makes sense to me now that nohz-full needs something to IPI that
> > > CPU in order to enforce the needed memory barrier and pure synchronize_sched()
> > > wouldn't work. So then that makes me think the expedited versions of
> > > synchronize_sched should be able to do the job, but I could be off on a
> > > different track..
> >
> > The problem is that the expedited versions also check the dyntick-idle
> > state and don't touch idle (or nohz_full usermode) CPUs. This is by
> > design for the battery-powered embedded use case. ;-)
>
> Oh ok! ;)
>
> I guess there's also a MEMBARRIER_CMD_GLOBAL_EXPEDITED which seems to IPI
> CPUs (I'm guessing regardless of dynticks state) and execute smp_mb within
> the IPI so userspace can fall back to using that in case MEMBARRIER_CMD_GLOBAL
> returns -EINVAL.

Yes, and this avoids IPIing idle CPUs via the ->mm checks. But it will
IPI nohz_full CPUs in that same process, as it must for correctness.

							Thanx, Paul
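
For readers following along, the userspace fallback discussed above might look roughly like the sketch below. It is only an illustration, not code from this thread: the helper names sys_membarrier() and global_barrier_with_fallback() are hypothetical, and it assumes a Linux system whose <linux/membarrier.h> defines the MEMBARRIER_CMD_GLOBAL* commands. The registration step reflects the membarrier(2) rule that the expedited global barrier only covers processes that have registered with MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper; glibc does not export membarrier(). */
static int sys_membarrier(int cmd, unsigned int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags);
}

/*
 * Hypothetical helper: try the non-expedited global barrier first and
 * fall back to the expedited variant if the kernel rejects the command
 * with EINVAL (the nohz_full case discussed above).
 *
 * Returns 0 on success, -1 on failure with errno set by the last call.
 */
static int global_barrier_with_fallback(void)
{
	if (sys_membarrier(MEMBARRIER_CMD_GLOBAL, 0) == 0)
		return 0;
	if (errno != EINVAL)
		return -1;

	/*
	 * The expedited global barrier only targets registered processes,
	 * so register this process before issuing it.
	 */
	if (sys_membarrier(MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED, 0) != 0)
		return -1;

	return sys_membarrier(MEMBARRIER_CMD_GLOBAL_EXPEDITED, 0);
}
```

A caller would invoke global_barrier_with_fallback() where it would otherwise issue MEMBARRIER_CMD_GLOBAL directly, and treat a -1 return as "no barrier was provided". Other processes whose threads must be covered would need to register themselves in the same way.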