Message-Id: <1551241306.u5r150hwwb.astroid@bobo.none>
Date: Wed, 27 Feb 2019 14:36:05 +1000
From: Nicholas Piggin <npiggin@...il.com>
To: linux-arch@...r.kernel.org, Will Deacon <will.deacon@....com>
Cc: Andrea Parri <andrea.parri@...rulasolutions.com>,
Arnd Bergmann <arnd@...db.de>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Rich Felker <dalias@...c.org>,
David Howells <dhowells@...hat.com>,
Daniel Lustig <dlustig@...dia.com>,
linux-kernel@...r.kernel.org,
"Maciej W. Rozycki" <macro@...ux-mips.org>,
Ingo Molnar <mingo@...nel.org>,
Michael Ellerman <mpe@...erman.id.au>,
Palmer Dabbelt <palmer@...ive.com>,
Paul Burton <paul.burton@...s.com>,
"Paul E. McKenney" <paulmck@...ux.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
Alan Stern <stern@...land.harvard.edu>,
Tony Luck <tony.luck@...el.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Yoshinori Sato <ysato@...rs.sourceforge.jp>
Subject: Re: [RFC PATCH 11/20] ia64: Add unconditional mmiowb() to
arch_spin_unlock()
Will Deacon wrote on February 23, 2019 4:50 am:
> The mmiowb() macro is horribly difficult to use and drivers will continue
> to work most of the time if they omit a call when it is required.
>
> Rather than rely on driver authors getting this right, push mmiowb() into
> arch_spin_unlock() for ia64. If this is deemed to be a performance issue,
> a subsequent optimisation could make use of ARCH_HAS_MMIOWB to elide
> the barrier in cases where no I/O writes were performed inside the
> critical section.
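
As a rough illustration of the optimisation described in the quoted text
above, here is a minimal userspace sketch in C. It is only a model under
stated assumptions: every name in it (model_lock, model_writel,
model_mmiowb, and so on) is hypothetical, the "barrier" is just a counter,
and none of it is the kernel's actual ARCH_HAS_MMIOWB implementation. It
shows the idea of recording that an MMIO write happened while the lock was
held, so that unlock can skip the barrier when nothing is pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a lock; not the kernel's spinlock. */
struct model_lock {
	bool held;          /* true between acquire and unlock */
	bool mmio_pending;  /* set by an MMIO write made while held */
};

/* Counts how many times the modelled barrier has been issued. */
static int barriers_issued;

/* Stand-in for mmiowb(): a real one would order MMIO writes. */
static void model_mmiowb(void)
{
	barriers_issued++;
}

static void model_lock_acquire(struct model_lock *l)
{
	l->held = true;
	l->mmio_pending = false;
}

/* Stand-in for an MMIO store such as writel(); records pending I/O. */
static void model_writel(struct model_lock *l, unsigned int val,
			 volatile unsigned int *reg)
{
	*reg = val;
	if (l->held)
		l->mmio_pending = true;
}

/* Unlock that always issues the barrier, as the patch does for ia64. */
static void model_unlock_unconditional(struct model_lock *l)
{
	assert(l->held);
	model_mmiowb();
	l->mmio_pending = false;
	l->held = false;
}

/* Unlock that elides the barrier when no MMIO write was recorded. */
static void model_unlock_elided(struct model_lock *l)
{
	assert(l->held);
	if (l->mmio_pending)
		model_mmiowb();
	l->mmio_pending = false;
	l->held = false;
}
```

In this model a critical section that never calls model_writel() costs no
barrier under model_unlock_elided(), while model_unlock_unconditional()
always pays for one. Either way the driver author no longer has to remember
the barrier.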

mmiowb() was always the wrong approach. IIRC what happened is that an
ia64 platform found that real wmb() semantics were too expensive, so
they kind of "relaxed" it, breaking everything, and then said that drivers
wanting to unbreak themselves had to add mmiowb() calls.

The right way to go, of course, would have been to implement wmb()
the way existing drivers expected, and to add a faster io_wmb() that
only ordered MMIO stores from the CPU, for use in the few drivers that
the platform cared about.

I think it was argued that wmb() was still technically correct because
the reordering did not happen at the CPU, but somewhere else in the
interconnect or PCI controller. But that was just a crazy burden to
put on driver writers, and it was why the documentation was always
incomprehensible.

Not sure why Linus ever went along with it, but it's awesome that you're
removing it. Thank you!

Thanks,
Nick