[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b7ccb5b8-4f19-4cfc-a94b-02bca7c4561f@paulmck-laptop>
Date: Fri, 20 Dec 2024 09:40:32 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Will Deacon <will@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Daniel Xu <dxu@...uu.xyz>,
mingo@...hat.com, longman@...hat.com, boqun.feng@...il.com,
linux-kernel@...r.kernel.org, linux-toolchains@...r.kernel.org
Subject: Re: [PATCH] seqlock: Use WRITE_ONCE() when updating sequence
On Thu, Dec 19, 2024 at 06:31:15PM +0000, Will Deacon wrote:
> On Thu, Dec 19, 2024 at 10:06:23AM -0800, Paul E. McKenney wrote:
> > On Thu, Dec 19, 2024 at 05:58:03PM +0000, Will Deacon wrote:
> > > On Thu, Dec 19, 2024 at 09:53:48AM -0800, Paul E. McKenney wrote:
> > > > On Thu, Dec 19, 2024 at 04:34:55PM +0000, Will Deacon wrote:
> > > > > On Wed, Dec 18, 2024 at 06:12:41PM +0100, Peter Zijlstra wrote:
> > > > > > diff --git a/arch/s390/kernel/idle.c b/arch/s390/kernel/idle.c
> > > > > > index 39cb8d0ae348..8fb7cd75fe62 100644
> > > > > > --- a/arch/s390/kernel/idle.c
> > > > > > +++ b/arch/s390/kernel/idle.c
> > > > > > @@ -45,7 +45,7 @@ void account_idle_time_irq(void)
> > > > > >
> > > > > > /* Account time spent with enabled wait psw loaded as idle time. */
> > > > > > WRITE_ONCE(idle->idle_time, READ_ONCE(idle->idle_time) + idle_time);
> > > > > > - WRITE_ONCE(idle->idle_count, READ_ONCE(idle->idle_count) + 1);
> > > > > > + INC_ONC(idle->idle_count);
> > > > >
> > > > > (nit: drive-by typo)
> > > >
> > > > Heh! I was clearly building with GCC. ;-)
> > >
> > > And not for S390 :p
> >
> > True enough! ;-)
> >
> > > > > > diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> > > > > > index c9b58188ec61..77c253e29758 100644
> > > > > > --- a/include/linux/compiler-gcc.h
> > > > > > +++ b/include/linux/compiler-gcc.h
> > > > > > @@ -137,3 +137,8 @@
> > > > > > #if GCC_VERSION < 90100
> > > > > > #undef __alloc_size__
> > > > > > #endif
> > > > > > +
> > > > > > +/*
> > > > > > + * GCC can't properly optimize the real one with volatile on.
> > > > > > + */
> > > > > > +#define INC_ONCE(v) (v)++
> > > > >
> > > > > I think I'm missing what we're trying to do here, but how is this a
> > > > > safe fallback? I'd have thought we'd need the whole READ_ONCE() +
> > > > > WRITE_ONCE() sequence if the compiler doesn't give us what we need...
> > > > >
> > > > > > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > > > > > index efd43df3a99a..b1b13dac1b9e 100644
> > > > > > --- a/include/linux/compiler.h
> > > > > > +++ b/include/linux/compiler.h
> > > > > > @@ -8,6 +8,10 @@
> > > > > >
> > > > > > #ifdef __KERNEL__
> > > > > >
> > > > > > +#ifndef INC_ONCE
> > > > > > +#define INC_ONCE(v) (*(volatile typeof(v) *)&(v))++
> > > > > > +#endif
> > > > >
> > > > > ... but I'm honestly not sure what this is aiming to elide. Is this about
> > > > > having the compiler emit an increment instruction with memory operands
> > > > > on architectures that support it? If so, what about architectures that
> > > > > _don't_ support that? We still need to guarantee that the load/store
> > > > > instructions are single-copy atomic for the type in that case.
> > > >
> > > > Architectures whose increment instructions lack memory operands can still
> > > > load, increment, then store. So yes, if two of these run concurrently,
> > > > you can lose counts when incrementing normal memory. (Of course, with
> > > > device memory, you get what you get.)
> > >
> > > ... but can we guarantee that those load and store instructions won't
> > > now be torn?
> >
> > The volatile type qualifier must take care of that, or device drivers
> > no longer work.
> >
> > Or are you worried about the C-language ++ fallback? In that case,
> > as I understand it, this fallback is only there until GCC learns to
> > generate a increment-to-memory instruction for x86.
>
> As I understand it, both the clang and the GCC implementations in the
> proposed diff are written using the C-language ++ operator, no?
>
> The GCC one doesn't have 'volatile' at all:
>
> #define INC_ONCE(v) (v)++
>
> so I don't understand why it's a correct replacement for e.g.:
>
> WRITE_ONCE(idle->idle_time, READ_ONCE(idle->idle_time) + idle_time);
>
> If it was, then we wouldn't need any of these macros in the first place!
You are right, it is not a correct replacement. It is instead a
behavior-preserving fallback until GCC does the optimizations.
Here the behavior is the code generation.
> The clang version does have 'volatile':
>
> #define INC_ONCE(v) (*(volatile typeof(v) *)&(v))++
>
> but then I don't understand why this is enough for the compiler to
> generate an 'inc' with memory operands on x86 if it cannot do that for
> the READ_ONCE + WRITE_ONCE implementation. If somehow the above is more
> "optimisable" then I worry about the atomicity being preserved as well.
>
> So, basically, I don't understand what's going on here :)
Peter learned that clang does in fact optimize the original READ_ONCE +
WRITE_ONCE implementation.
> I think I was expecting to see arch-specific implementations of
> INC_ONCE() written in assembly where they have the relevant instructions
> but that doesn't seem to be the case.
That is yet another fallback, though with quite a bit more code. :-(
Thanx, Paul
Powered by blists - more mailing lists