Date: Sun, 03 Mar 2019 11:43:09 +1000
From: Nicholas Piggin <npiggin@...il.com>
To: linux-arch@...r.kernel.org, Will Deacon <will.deacon@....com>
Cc: Andrea Parri <andrea.parri@...rulasolutions.com>,
Arnd Bergmann <arnd@...db.de>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Rich Felker <dalias@...c.org>,
David Howells <dhowells@...hat.com>,
Daniel Lustig <dlustig@...dia.com>,
linux-kernel@...r.kernel.org,
"Maciej W. Rozycki" <macro@...ux-mips.org>,
Ingo Molnar <mingo@...nel.org>,
Michael Ellerman <mpe@...erman.id.au>,
Palmer Dabbelt <palmer@...ive.com>,
Paul Burton <paul.burton@...s.com>,
"Paul E. McKenney" <paulmck@...ux.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
Alan Stern <stern@...land.harvard.edu>,
Tony Luck <tony.luck@...el.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Yoshinori Sato <ysato@...rs.sourceforge.jp>
Subject: Re: [PATCH 01/20] asm-generic/mmiowb: Add generic implementation of
mmiowb() tracking
Excerpts from Will Deacon's message of March 2, 2019 12:03 am:
> In preparation for removing all explicit mmiowb() calls from driver
> code, implement a tracking system in asm-generic based loosely on the
> PowerPC implementation. This allows architectures with a non-empty
> mmiowb() definition to have the barrier automatically inserted in
> spin_unlock() following a critical section containing an I/O write.
Is there a reason to call this "mmiowb"? We already have wmb() that
orders cacheable stores vs mmio stores, don't we?

Yes, ia64 "sn2" is broken in that case, but that can be fixed (if
anyone really cares about the platform any more). Maybe that's
orthogonal to what you're doing here; I just don't like seeing
"mmiowb" spread.
This series works for spin locks, but you would want a driver to
be able to use wmb() to order locks vs mmio when using a bit lock
or a mutex or whatever else. Calling your wmb-if-io-is-pending
version io_mb_before_unlock() would kind of match with existing
patterns.
> +static inline void mmiowb_set_pending(void)
> +{
> + struct mmiowb_state *ms = __mmiowb_state();
> + ms->mmiowb_pending = ms->nesting_count;
> +}
> +
> +static inline void mmiowb_spin_lock(void)
> +{
> + struct mmiowb_state *ms = __mmiowb_state();
> + ms->nesting_count++;
> +}
> +
> +static inline void mmiowb_spin_unlock(void)
> +{
> + struct mmiowb_state *ms = __mmiowb_state();
> +
> + if (unlikely(ms->mmiowb_pending)) {
> + ms->mmiowb_pending = 0;
> + mmiowb();
> + }
> +
> + ms->nesting_count--;
> +}
Humour me for a minute and tell me what this algorithm is doing, or
what was broken about the powerpc one, which is basically:
static inline void mmiowb_set_pending(void)
{
	struct mmiowb_state *ms = __mmiowb_state();
	ms->mmiowb_pending = 1;
}

static inline void mmiowb_spin_lock(void)
{
}

static inline void mmiowb_spin_unlock(void)
{
	struct mmiowb_state *ms = __mmiowb_state();

	if (unlikely(ms->mmiowb_pending)) {
		ms->mmiowb_pending = 0;
		mmiowb();
	}
}
> diff --git a/include/asm-generic/mmiowb_types.h b/include/asm-generic/mmiowb_types.h
> new file mode 100644
> index 000000000000..8eb0095655e7
> --- /dev/null
> +++ b/include/asm-generic/mmiowb_types.h
> @@ -0,0 +1,12 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_GENERIC_MMIOWB_TYPES_H
> +#define __ASM_GENERIC_MMIOWB_TYPES_H
> +
> +#include <linux/types.h>
> +
> +struct mmiowb_state {
> + u16 nesting_count;
> + u16 mmiowb_pending;
> +};
Do we really need more than 255 nested spin locks? I had the idea that
16-bit operations were a bit more costly than 8-bit on some CPUs...
that may not be true, but at least the smaller size packs a bit better
on powerpc.
Thanks,
Nick