Date:   Wed, 7 Sep 2016 13:31:02 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Matt Redfearn <matt.redfearn@...tec.com>
Cc:     Ralf Baechle <ralf@...ux-mips.org>, linux-mips@...ux-mips.org,
        Arnd Bergmann <arnd@...db.de>, linux-kernel@...r.kernel.org,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Paul Burton <paul.burton@...tec.com>,
        Will Deacon <will.deacon@....com>
Subject: Re: [PATCH v2 05/12] MIPS: Barrier: Add definitions of SYNC stype
 values

On Wed, Sep 07, 2016 at 01:24:23PM +0200, Peter Zijlstra wrote:
> > +/*
> > + * Ordering barriers:
> > + * - Every synchronizable specified memory instruction (loads or stores or both)
> > + *   that occurs in the instruction stream before the SYNC instruction must
> > + *   reach a stage in the load/store datapath after which no instruction
> > + *   re-ordering is possible before any synchronizable specified memory
> > + *   instruction which occurs after the SYNC instruction in the instruction
> > + *   stream reaches the same stage in the load/store datapath.
> > + *
> > + * - If any memory instruction before the SYNC instruction in program order,
> > + *   generates a memory request to the external memory and any memory
> > + *   instruction after the SYNC instruction in program order also generates a
> > + *   memory request to external memory, the memory request belonging to the
> > + *   older instruction must be globally performed before the time the memory
> > + *   request belonging to the younger instruction is globally performed.
> > + *
> > + * - The barrier does not guarantee the order in which instruction fetches are
> > + *   performed.
> > + */
> > +
> > +/*
> > + * stype 0x10 - An ordering barrier that affects preceding loads and stores and
> > + * subsequent loads and stores.
> > + * Older instructions which must reach the load/store ordering point before the
> > + * SYNC instruction completes: Loads, Stores
> > + * Younger instructions which must reach the load/store ordering point only
> > + * after the SYNC instruction completes: Loads, Stores
> > + * Older instructions which must be globally performed when the SYNC instruction
> > + * completes: N/A
> > + */
> > +#define STYPE_SYNC_MB 0x10
> 
> This I'm not sure of; it states that things must become globally visible
> in the order specified, but the wording leaves a fairly big hole. It
> doesn't state that things cannot be less than globally visible at
> intermediate times.
> 
> To take the example from Documentation/memory-barriers.txt:
> 
>         CPU 1                   CPU 2                   CPU 3
>         ======================= ======================= =======================
>                 { X = 0, Y = 0 }
>         STORE X=1               LOAD X                  STORE Y=1
>                                 <general barrier>       <general barrier>
>                                 LOAD Y                  LOAD X
> 
> Suppose that CPU 2's load from X returns 1 and its load from Y returns 0.
> This indicates that CPU 2's load from X in some sense follows CPU 1's
> store to X and that CPU 2's load from Y in some sense preceded CPU 3's
> store to Y.  The question is then "Can CPU 3's load from X return 0?"
> 
> 
> Is it ever possible for CPU2 and CPU3 to match "SYNC 10" points but to
> disagree on their loads of X?
> 
> That is, even though CPU2 and CPU3 agree on their respective past and
> future stores, the 'happens before' relation CPU1 and CPU2 have wrt. X
> is not included?
> 

Now, I suspect it _is_ transitive, because CPU2's "LOAD X" must be
globally performed wrt. CPU3's "LOAD X", and as I read it, that can only
hold if the STORE to X is globally visible by then.

But, as I said, the wording leaves room for doubt, so clarification would
be grand.

Also, if (and only if) "SYNC 10" is indeed transitive, you should be able
to implement smp_mb() with it unconditionally.
