Message-ID: <CAK8P3a3xGQypztPBkms6WgxPmOM4ERFanJa1xq8qFxfYYcY9RQ@mail.gmail.com>
Date:   Wed, 12 Jul 2017 15:13:15 +0200
From:   Arnd Bergmann <arnd@...db.de>
To:     Palmer Dabbelt <palmer@...belt.com>
Cc:     Olof Johansson <olof@...om.net>,
        Andrew Morton <akpm@...ux-foundation.org>, albert@...ive.com,
        Masahiro Yamada <yamada.masahiro@...ionext.com>,
        Michal Marek <mmarek@...e.com>,
        Will Deacon <will.deacon@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Boqun Feng <boqun.feng@...il.com>,
        Ingo Molnar <mingo@...hat.com>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Jason Cooper <jason@...edaemon.net>,
        Marc Zyngier <marc.zyngier@....com>,
        gregkh <gregkh@...uxfoundation.org>,
        Jiri Slaby <jslaby@...e.com>,
        David Miller <davem@...emloft.net>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        Stephen Rothwell <sfr@...b.auug.org.au>,
        Frédéric Weisbecker <fweisbec@...il.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        "Luis R. Rodriguez" <mcgrof@...nel.org>,
        Doug Ledford <dledford@...hat.com>,
        Bart Van Assche <bart.vanassche@...disk.com>,
        sstabellini@...nel.org, Daniel Vetter <daniel.vetter@...ll.ch>,
        Michael Ellerman <mpe@...erman.id.au>,
        Mark Salter <msalter@...hat.com>,
        Nicolas Dichtel <nicolas.dichtel@...nd.com>,
        James Hogan <james.hogan@...tec.com>,
        Paul Gortmaker <paul.gortmaker@...driver.com>,
        Guenter Roeck <linux@...ck-us.net>,
        Heiko Carstens <heiko.carstens@...ibm.com>,
        Martin Schwidefsky <schwidefsky@...ibm.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        patches@...ups.riscv.org
Subject: Re: [PATCH 10/17] RISC-V: Atomic and Locking Code

On Wed, Jul 12, 2017 at 3:31 AM, Palmer Dabbelt <palmer@...belt.com> wrote:

> +/*
> + * FIXME: I'm flip-flopping on whether or not we should keep this or enforce
> + * the ordering with I/O on spinlocks.  The worry is that drivers won't get
> + * this correct, but I also don't want to introduce a fence into the lock code
> + * that otherwise only uses AMOs and LR/SC.   For now I'm leaving this here:
> + * "w,o" is sufficient to ensure that all writes to the device has completed
> + * before the write to the spinlock is allowed to commit.  I surmised this from
> + * reading "ACQUIRES VS I/O ACCESSES" in memory-barriers.txt.
> + */
> +#define mmiowb()       __asm__ __volatile__ ("fence o,w" : : : "memory");

Not sure if this has been discussed before, but have you considered doing
it like powerpc with a per-cpu flag that gets checked by spin_unlock()?
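
To make the suggestion concrete, here is a minimal user-space sketch of the
per-cpu flag idea. It is a model under stated assumptions, not the powerpc
implementation and not proposed kernel code: struct cpu_model and all
model_* names are hypothetical, and the fence is only counted rather than
executed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one CPU; a kernel would keep this state per CPU. */
struct cpu_model {
	bool io_pending;	/* set by MMIO writes, cleared on unlock */
	bool locked;		/* stand-in for the spinlock itself */
	unsigned int fences;	/* how many I/O fences the model issued */
};

/* Stand-in for the real barrier (e.g. the "fence o,w" from the patch). */
static void model_io_fence(struct cpu_model *cpu)
{
	cpu->fences++;
}

/* An MMIO write only records that an I/O write happened on this CPU. */
static void model_mmio_write(struct cpu_model *cpu,
			     volatile uint32_t *reg, uint32_t val)
{
	*reg = val;
	cpu->io_pending = true;
}

static void model_spin_lock(struct cpu_model *cpu)
{
	assert(!cpu->locked);
	cpu->locked = true;
}

/* Unlock pays for the fence only if an MMIO write happened beforehand. */
static void model_spin_unlock(struct cpu_model *cpu)
{
	assert(cpu->locked);
	if (cpu->io_pending) {
		model_io_fence(cpu);
		cpu->io_pending = false;
	}
	cpu->locked = false;
}
```

In this model a critical section without MMIO writes releases the lock
without any fence, while one containing a model_mmio_write() issues exactly
one fence on unlock, so drivers would not have to call mmiowb() themselves.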



> +/*
> + * Relaxed I/O memory access primitives. These follow the Device memory
> + * ordering rules but do not guarantee any ordering relative to Normal memory
> + * accesses.  The memory barriers here are necessary as RISC-V doesn't define
> + * any ordering constraints on accesses to the device I/O space.  These are
> + * defined to order the indicated access (either a read or write) with all
> + * other I/O memory accesses.
> + */
> +/*
> + * FIXME: The platform spec will define the default Linux-capable platform to
> + * have some extra IO ordering constraints that will make these fences
> + * unnecessary.
> + */
> +#define __iorrmb()     __asm__ __volatile__ ("fence i,io" : : : "memory");
> +#define __iorwmb()     __asm__ __volatile__ ("fence io,o" : : : "memory");
> +
> +#define readb_relaxed(c)       ({ u8  __v = readb_cpu(c); __iorrmb(); __v; })
> +#define readw_relaxed(c)       ({ u16 __v = readw_cpu(c); __iorrmb(); __v; })
> +#define readl_relaxed(c)       ({ u32 __v = readl_cpu(c); __iorrmb(); __v; })
> +#define readq_relaxed(c)       ({ u64 __v = readq_cpu(c); __iorrmb(); __v; })
> +
> +#define writeb_relaxed(v,c)    ({ __iorwmb(); writeb_cpu((v),(c)); })
> +#define writew_relaxed(v,c)    ({ __iorwmb(); writew_cpu((v),(c)); })
> +#define writel_relaxed(v,c)    ({ __iorwmb(); writel_cpu((v),(c)); })
> +#define writeq_relaxed(v,c)    ({ __iorwmb(); writeq_cpu((v),(c)); })

What are the ordering rules for a write followed by a read? You say
that there are no ordering constraints on I/O access at all, but you don't
add any barriers between that pair, which is probably the most important
to get right.
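
The gap can be illustrated with a small user-space model that only records
where the quoted macros place their fences. This is a sketch, not kernel
code: the trace type and the model_* names are hypothetical, and the check
deliberately treats every fence as a full barrier, ignoring the
predecessor/successor sets of the real fence instruction.

```c
#include <assert.h>
#include <stddef.h>

enum ev { EV_FENCE, EV_IO_STORE, EV_IO_LOAD };

#define TRACE_MAX 16

struct trace {
	enum ev ev[TRACE_MAX];
	size_t n;
};

static void trace_add(struct trace *t, enum ev e)
{
	assert(t->n < TRACE_MAX);
	t->ev[t->n++] = e;
}

/* Mirrors the quoted writel_relaxed(): fence first, then the store. */
static void model_writel_relaxed(struct trace *t)
{
	trace_add(t, EV_FENCE);		/* __iorwmb() */
	trace_add(t, EV_IO_STORE);	/* writel_cpu() */
}

/* Mirrors the quoted readl_relaxed(): load first, then the fence. */
static void model_readl_relaxed(struct trace *t)
{
	trace_add(t, EV_IO_LOAD);	/* readl_cpu() */
	trace_add(t, EV_FENCE);		/* __iorrmb() */
}

/* Returns 1 if every I/O load has a fence since the last I/O store. */
static int store_load_fenced(const struct trace *t)
{
	int store_open = 0;	/* store seen with no fence after it yet */

	for (size_t i = 0; i < t->n; i++) {
		switch (t->ev[i]) {
		case EV_IO_STORE:
			store_open = 1;
			break;
		case EV_FENCE:
			store_open = 0;
			break;
		case EV_IO_LOAD:
			if (store_open)
				return 0;
			break;
		}
	}
	return 1;
}
```

A model_writel_relaxed() followed by model_readl_relaxed() produces the
sequence fence, store, load, fence, so store_load_fenced() returns 0 even
under the generous "any fence orders everything" assumption. The reverse
order (read, then write) has two fences between the accesses.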

Also, can you say here why the architecture is defined like this?
I'm sure the semantics were considered very carefully, yet it appears
that MMIO access is defined to be this relaxed purely as an optimization.
As a consequence, I/O ends up much slower than with a more typical
definition, because the operating system has to add much heavier
barriers than would otherwise be needed.

> +/*
> + * I/O memory access primitives. Reads are ordered relative to any
> + * following Normal memory access. Writes are ordered relative to any prior
> + * Normal memory access.  The memory barriers here are necessary as RISC-V
> + * doesn't define any ordering between the memory space and the I/O space.
> + * They may be stronger than necessary ("i,r" and "w,o" might be sufficient),
> + * but I feel kind of queasy making these weaker in any manner than the relaxed
> + * versions above.
> + */
> +#define __iormb()      __asm__ __volatile__ ("fence i,ior" : : : "memory");
> +#define __iowmb()      __asm__ __volatile__ ("fence iow,o" : : : "memory");
> +
> +#define readb(c)               ({ u8  __v = readb_cpu(c); __iormb(); __v; })
> +#define readw(c)               ({ u16 __v = readw_cpu(c); __iormb(); __v; })
> +#define readl(c)               ({ u32 __v = readl_cpu(c); __iormb(); __v; })
> +#define readq(c)               ({ u64 __v = readq_cpu(c); __iormb(); __v; })
> +
> +#define writeb(v,c)            ({ __iowmb(); writeb_cpu((v),(c)); })
> +#define writew(v,c)            ({ __iowmb(); writew_cpu((v),(c)); })
> +#define writel(v,c)            ({ __iowmb(); writel_cpu((v),(c)); })
> +#define writeq(v,c)            ({ __iowmb(); writeq_cpu((v),(c)); })

These look good to me, although the write-followed-by-read question I
asked about the _relaxed versions applies here as well.

I see that you don't override the PCI I/O space accessors (inb/outb etc),
which have even stricter ordering requirements. I think you also need a
barrier after outb/outw/outl to ensure that the write has completed on the
bus before anything else happens.

Finally, I think you also need to override the string functions
(readsl/writesl/insl/outsl) that are lacking barriers in the asm-generic
version.
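
As a rough illustration of what such an override could look like, here is
a user-space sketch with one barrier around the whole transfer. It is one
possible shape under stated assumptions, not the asm-generic code and not a
tested kernel patch: the model_* names are hypothetical, and the barrier is
a counter standing in for __iormb()/__iowmb().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static unsigned int model_fences;	/* barriers issued by the model */

/* Stand-ins for __iormb() and __iowmb(); they only count. */
static void model_iormb(void) { model_fences++; }
static void model_iowmb(void) { model_fences++; }

/* Read count words from one FIFO-style register, then fence once. */
static void model_readsl(const volatile uint32_t *addr,
			 uint32_t *buf, size_t count)
{
	assert(buf != NULL || count == 0);
	for (size_t i = 0; i < count; i++)
		buf[i] = *addr;		/* raw access, no per-word fence */
	model_iormb();			/* order reads before later accesses */
}

/* Fence once, then write count words to one FIFO-style register. */
static void model_writesl(volatile uint32_t *addr,
			  const uint32_t *buf, size_t count)
{
	assert(buf != NULL || count == 0);
	model_iowmb();			/* order earlier accesses before writes */
	for (size_t i = 0; i < count; i++)
		*addr = buf[i];		/* raw access, no per-word fence */
}
```

Issuing a single barrier per string operation instead of one per word keeps
the transfer loop cheap while still ordering the whole burst against the
surrounding accesses. Whether that is sufficient depends on the answer to
the ordering questions above.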

         Arnd
