Message-ID: <20190226182624.GC28709@fuggles.cambridge.arm.com>
Date: Tue, 26 Feb 2019 18:26:24 +0000
From: Will Deacon <will.deacon@....com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: linux-arch <linux-arch@...r.kernel.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
"Paul E. McKenney" <paulmck@...ux.ibm.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Michael Ellerman <mpe@...erman.id.au>,
Arnd Bergmann <arnd@...db.de>,
Peter Zijlstra <peterz@...radead.org>,
Andrea Parri <andrea.parri@...rulasolutions.com>,
Palmer Dabbelt <palmer@...ive.com>,
Daniel Lustig <dlustig@...dia.com>,
David Howells <dhowells@...hat.com>,
Alan Stern <stern@...land.harvard.edu>,
"Maciej W. Rozycki" <macro@...ux-mips.org>,
Paul Burton <paul.burton@...s.com>,
Ingo Molnar <mingo@...nel.org>,
Yoshinori Sato <ysato@...rs.sourceforge.jp>,
Rich Felker <dalias@...c.org>, Tony Luck <tony.luck@...el.com>
Subject: Re: [RFC PATCH 01/20] asm-generic/mmiowb: Add generic implementation
of mmiowb() tracking
Hi Linus,
Thanks for having a look.
On Fri, Feb 22, 2019 at 01:49:32PM -0800, Linus Torvalds wrote:
> On Fri, Feb 22, 2019 at 10:50 AM Will Deacon <will.deacon@....com> wrote:
> >
> > +#ifndef mmiowb_set_pending
> > +static inline void mmiowb_set_pending(void)
> > +{
> > +	__this_cpu_write(__mmiowb_state.mmiowb_pending, 1);
> > +}
> > +#endif
> > +
> > +#ifndef mmiowb_spin_lock
> > +static inline void mmiowb_spin_lock(void)
> > +{
> > +	if (__this_cpu_inc_return(__mmiowb_state.nesting_count) == 1)
> > +		__this_cpu_write(__mmiowb_state.mmiowb_pending, 0);
> > +}
> > +#endif
>
> The case we want to go fast is the spin-lock and unlock case, not the
> "set pending" case.
>
> And the way you implemented this, it's exactly the wrong way around.
>
> So I'd suggest instead doing
>
> static inline void mmiowb_set_pending(void)
> {
> 	__this_cpu_write(__mmiowb_state.mmiowb_pending,
> 			 __mmiowb_state.nesting_count);
> }
>
> and
>
> static inline void mmiowb_spin_lock(void)
> {
> 	__this_cpu_inc(__mmiowb_state.nesting_count);
> }
>
> which makes that spin-lock code much simpler and avoids the conditional there.
Makes sense; I'll hook that up for the next version.
> Then the unlock case could be something like
>
> static inline void mmiowb_spin_unlock(void)
> {
> 	if (unlikely(__this_cpu_read(__mmiowb_state.mmiowb_pending))) {
> 		__this_cpu_write(__mmiowb_state.mmiowb_pending, 0);
> 		mmiowb();
> 	}
> 	__this_cpu_dec(__mmiowb_state.nesting_count);
> }
>
> or something (xchg is generally much more expensive than read, and the
> common case for spinlocks is that nobody did IO inside of it).
So I *am* using __this_cpu_xchg() here, which means the architecture can
get away with plain old loads and stores (which is what RISC-V does, for
example), but I see that's not the case on e.g. x86, so I'll rework this
using read() and write(), since it doesn't hurt.
Will