Date:   Tue, 29 Aug 2017 22:01:56 -0700
From:   Andy Lutomirski
To:     Mathieu Desnoyers
Cc:     "Paul E. McKenney",
        Peter Zijlstra,
        linux-kernel,
        Boqun Feng,
        Andrew Hunter,
        maged michael,
        gromer, Avi Kivity,
        Benjamin Herrenschmidt,
        Paul Mackerras,
        Michael Ellerman,
        Dave Watson,
        Andy Lutomirski,
        Will Deacon,
        Hans Boehm
Subject: Re: [PATCH v2] membarrier: provide register sync core cmd

> On Aug 27, 2017, at 8:05 PM, Mathieu Desnoyers wrote:
> ----- On Aug 27, 2017, at 3:53 PM, Andy Lutomirski wrote:
>>> On Aug 27, 2017, at 1:50 PM, Mathieu Desnoyers wrote:
>>> Add a new MEMBARRIER_CMD_REGISTER_SYNC_CORE command to the membarrier
>>> system call. It allows processes to register their intent to have their
>>> threads issue core serializing barriers in addition to memory barriers
>>> whenever a membarrier command is performed.
>> Why is this stateful?  That is, why not just have a new membarrier command to
>> sync every thread's icache?
> If we'd do it on every CPU icache, it would be as trivial as you say. The
> concern here is sending IPIs only to CPUs running threads that belong to the
> same process, so we don't disturb unrelated processes.
> If we could just grab each CPU's runqueue lock, it would be fairly simple
> to do. But we want to avoid hitting each runqueue with exclusive atomic
> access associated with grabbing the lock. (cache-line bouncing)

Hmm.  Are there really arches where there is no clean implementation
without this hack?  It seems rather unfortunate that munmap() can be
done efficiently but this barrier can't be.

At the very least, could there be a register command *and* a special
sync command?  I dislike the idea that the sync command does something
different depending on some other state.  Even better (IMO) would be a
design where you ask for an isync and, if the arch can do it
efficiently (x86), you get an efficient isync; if the arch can't
(arm64?), you take all the rq locks.
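
To make the "register command *and* a special sync command" split
concrete, here is a rough userspace sketch.  It is not the interface
from this patch: the command values are copied from what mainline
<linux/membarrier.h> (4.16 and later) calls
MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE and
MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, and the CMD_* and
helper names are made up for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Assumption: values mirror mainline <linux/membarrier.h> (4.16+).
 * They are redefined under local, hypothetical names so the sketch
 * does not depend on the installed kernel headers.
 */
enum {
    CMD_QUERY              = 0,
    CMD_SYNC_CORE          = 1 << 5, /* ..._PRIVATE_EXPEDITED_SYNC_CORE */
    CMD_REGISTER_SYNC_CORE = 1 << 6, /* ..._REGISTER_PRIVATE_EXPEDITED_SYNC_CORE */
};

/* Thin wrapper: returns the syscall result, or -errno on failure. */
int membarrier_call(int cmd)
{
    long ret = syscall(SYS_membarrier, cmd, 0);
    return ret < 0 ? -errno : (int)ret;
}

/* Does a CMD_QUERY result advertise both the register and sync commands? */
int sync_core_supported(int mask)
{
    int need = CMD_SYNC_CORE | CMD_REGISTER_SYNC_CORE;
    return mask >= 0 && (mask & need) == need;
}

/* One-time registration; returns 0 on success or a negative errno. */
int sync_core_register(void)
{
    int mask = membarrier_call(CMD_QUERY);
    if (!sync_core_supported(mask))
        return mask < 0 ? mask : -EOPNOTSUPP;
    return membarrier_call(CMD_REGISTER_SYNC_CORE);
}

/* Dedicated sync command: its meaning does not depend on other state. */
int sync_core_all_threads(void)
{
    return membarrier_call(CMD_SYNC_CORE);
}
```

A JIT would call sync_core_register() once at startup and
sync_core_all_threads() after patching code.  In mainline, issuing the
sync command without registering first fails with EPERM instead of
silently doing something weaker.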

