Message-ID: <CALCETrXTgNNwB4670QgV=ZLoubRrUATZJAiihUgLMuRUVN=baA@mail.gmail.com>
Date: Tue, 29 Aug 2017 22:01:56 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Boqun Feng <boqun.feng@...il.com>,
Andrew Hunter <ahh@...gle.com>,
maged michael <maged.michael@...il.com>,
gromer <gromer@...gle.com>, Avi Kivity <avi@...lladb.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>,
Dave Watson <davejwatson@...com>,
Andy Lutomirski <luto@...nel.org>,
Will Deacon <will.deacon@....com>,
Hans Boehm <hboehm@...gle.com>
Subject: Re: [PATCH v2] membarrier: provide register sync core cmd
> On Aug 27, 2017, at 8:05 PM, Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
>
> ----- On Aug 27, 2017, at 3:53 PM, Andy Lutomirski luto@...capital.net wrote:
>
>>> On Aug 27, 2017, at 1:50 PM, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
>>> wrote:
>>>
>>> Add a new MEMBARRIER_CMD_REGISTER_SYNC_CORE command to the membarrier
>>> system call. It allows processes to register their intent to have their
>>> threads issue core serializing barriers in addition to memory barriers
>>> whenever a membarrier command is performed.
>>>
>>
>> Why is this stateful? That is, why not just have a new membarrier command to
>> sync every thread's icache?
>
> If we'd do it on every CPU icache, it would be as trivial as you say. The
> concern here is sending IPIs only to CPUs running threads that belong to the
> same process, so we don't disturb unrelated processes.
>
> If we could just grab each CPU's runqueue lock, it would be fairly simple
> to do. But we want to avoid hitting each runqueue with exclusive atomic
> access associated with grabbing the lock. (cache-line bouncing)

Hmm. Are there really arches where there is no clean implementation
without this hackery? It seems rather unfortunate that munmap() can be
done efficiently but this barrier can't be.

At the very least, could there be a register command *and* a special
sync command? I dislike the idea that the sync command does something
different depending on some other state. Even better (IMO) would be a
design where you ask for an isync and, if the arch can do it
efficiently (x86), you get an efficient isync and, if the arch can't
(arm64?), you take all the rq locks?
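
As a rough sketch of the two alternatives suggested above: the following is
plain userspace C that only models which path a request would take. Every
name in it (the sketch_* types, commands and functions) is hypothetical and
made up for illustration; none of it is the patch's interface or kernel code,
and the failure behaviour for an unregistered caller is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical; none come from the patch or the kernel. */

enum sketch_cmd {
    SKETCH_CMD_REGISTER_SYNC_CORE, /* only records intent, issues no barrier */
    SKETCH_CMD_SYNC_CORE,          /* dedicated command: always a core sync */
};

enum sketch_path {
    SKETCH_PATH_ERROR,        /* request rejected */
    SKETCH_PATH_NONE,         /* nothing serialized (registration only) */
    SKETCH_PATH_TARGETED,     /* only CPUs running the process are disturbed */
    SKETCH_PATH_ALL_RQ_LOCKS, /* fallback: visit every runqueue under its lock */
};

struct sketch_process { bool registered; };
struct sketch_arch { bool efficient_isync; };

/* Stateless design: the caller asks for an isync, the arch picks the path. */
enum sketch_path sketch_isync(const struct sketch_arch *arch)
{
    assert(arch != NULL);
    return arch->efficient_isync ? SKETCH_PATH_TARGETED
                                 : SKETCH_PATH_ALL_RQ_LOCKS;
}

/* Register-plus-dedicated-sync design: each command has one fixed meaning. */
enum sketch_path sketch_membarrier(struct sketch_process *proc,
                                   enum sketch_cmd cmd)
{
    assert(proc != NULL);
    switch (cmd) {
    case SKETCH_CMD_REGISTER_SYNC_CORE:
        proc->registered = true;
        return SKETCH_PATH_NONE;
    case SKETCH_CMD_SYNC_CORE:
        /* Assumption: an unregistered caller gets an error, not a silent
         * change in what the command does. */
        if (!proc->registered)
            return SKETCH_PATH_ERROR;
        /* Assumption: registration is what makes targeting possible. */
        return SKETCH_PATH_TARGETED;
    }
    return SKETCH_PATH_ERROR;
}
```

In this model the result of SKETCH_CMD_SYNC_CORE never depends on which other
command was issued, only on whether the request is valid, and sketch_isync()
needs no per-process state at all.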
--Andy