Message-ID: <20170831170035.GC26273@arm.com>
Date: Thu, 31 Aug 2017 18:00:35 +0100
From: Will Deacon <will.deacon@....com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Andy Lutomirski <luto@...capital.net>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Boqun Feng <boqun.feng@...il.com>,
Andrew Hunter <ahh@...gle.com>,
maged michael <maged.michael@...il.com>,
gromer <gromer@...gle.com>, Avi Kivity <avi@...lladb.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>,
Dave Watson <davejwatson@...com>,
Andy Lutomirski <luto@...nel.org>,
Hans Boehm <hboehm@...gle.com>
Subject: Re: [PATCH v2] membarrier: provide register sync core cmd

On Mon, Aug 28, 2017 at 03:05:46AM +0000, Mathieu Desnoyers wrote:
> ----- On Aug 27, 2017, at 3:53 PM, Andy Lutomirski luto@...capital.net wrote:
>
> >> On Aug 27, 2017, at 1:50 PM, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
> >> wrote:
> >>
> >> Add a new MEMBARRIER_CMD_REGISTER_SYNC_CORE command to the membarrier
> >> system call. It allows processes to register their intent to have their
> >> threads issue core serializing barriers in addition to memory barriers
> >> whenever a membarrier command is performed.
> >>
> >
> > Why is this stateful? That is, why not just have a new membarrier command to
> > sync every thread's icache?
>
> If we'd do it on every CPU icache, it would be as trivial as you say. The
> concern here is sending IPIs only to CPUs running threads that belong to the
> same process, so we don't disturb unrelated processes.
>
> If we could just grab each CPU's runqueue lock, it would be fairly simple
> to do. But we want to avoid hitting each runqueue with exclusive atomic
> access associated with grabbing the lock. (cache-line bouncing)

I'm still trying to get my head around this for arm64, where we have the
following properties:

  * Return to userspace is context-synchronizing
  * We have a heavy barrier in switch_to

so it would seem to me that we could avoid taking RQ locks if the mm_cpumask
was kept up to date. The problematic case is where a CPU is not observed in
the mask (maybe the write is buffered), but it is running in userspace.
However, that can't occur with the barrier in switch_to.

So we only need to IPI those CPUs that were in userspace for this task
at the point when the syscall was made, and the mm_cpumask should reflect
that.
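
To make the selection step concrete, here is a rough userspace model of it
(not kernel code, and not taken from the patch). It assumes the mask is
already up to date, as argued above. The names cpumask_model_t, ipi_targets()
and count_targets() are hypothetical, a 64-bit word stands in for mm_cpumask,
and skipping the calling CPU is an assumption made for the illustration:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: bit n set means CPU n is currently running a
 * thread of the calling process (a stand-in for mm_cpumask).
 */
typedef uint64_t cpumask_model_t;

/*
 * Return the CPUs that would be sent an IPI: every CPU present in the
 * mask except the caller's own CPU (assumption: the caller is already
 * executing the syscall and needs no IPI).
 */
static cpumask_model_t ipi_targets(cpumask_model_t mm_mask,
				   unsigned int self_cpu)
{
	assert(self_cpu < 64);	/* keep the shift below well defined */
	return mm_mask & ~((cpumask_model_t)1 << self_cpu);
}

/* Count how many IPIs the model would send. */
static unsigned int count_targets(cpumask_model_t targets)
{
	unsigned int n = 0;

	while (targets) {
		targets &= targets - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

With a mask of 0x2D (CPUs 0, 2, 3 and 5) and the caller on CPU 2, the model
picks CPUs 0, 3 and 5 and leaves every other CPU undisturbed.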

What am I missing?

Will