linux-kernel - Re: [PATCH] seqlock: Use WRITE

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <7f7ac0e2-8103-426e-aade-5a37246e8a98@paulmck-laptop>
Date: Thu, 19 Dec 2024 09:53:48 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Will Deacon <will@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Daniel Xu <dxu@...uu.xyz>,
	mingo@...hat.com, longman@...hat.com, boqun.feng@...il.com,
	linux-kernel@...r.kernel.org, linux-toolchains@...r.kernel.org
Subject: Re: [PATCH] seqlock: Use WRITE_ONCE() when updating sequence

On Thu, Dec 19, 2024 at 04:34:55PM +0000, Will Deacon wrote:
> On Wed, Dec 18, 2024 at 06:12:41PM +0100, Peter Zijlstra wrote:
> > 
> > +linux-toolchains
> > 
> > On Wed, Dec 18, 2024 at 08:59:47AM -0800, Paul E. McKenney wrote:
> > 
> > > > Perhaps something like: (*(volatile unsigned int *)&s->sequence)++; ?
> > > > I'd have to check what the compiler makes of that.
> > > > 
> > > > /me mucks about with godbolt for a bit...
> > > > 
> > > > GCC doesn't optimize that, but Clang does.
> > > > 
> > > > I would still very much refrain from making this change until both
> > > > compilers can generate sane code for it.
> > > 
> > > Is GCC on track to do this, or do we need to encourage them?
> > 
> > I have no clue; probably wise to offer encouragement.
> > 
> > > Just to make sure I understand, your proposal is to create an INC_ONCE()
> > > or similar, and add it once compiler support is there?  Seems reasonable
> > > to me, just checking.
> > 
> > I suppose we can work in parallel, add INC_ONCE() now, but not have a
> > proper definition for GCC.
> 
> [...]
> 
> > diff --git a/arch/s390/kernel/idle.c b/arch/s390/kernel/idle.c
> > index 39cb8d0ae348..8fb7cd75fe62 100644
> > --- a/arch/s390/kernel/idle.c
> > +++ b/arch/s390/kernel/idle.c
> > @@ -45,7 +45,7 @@ void account_idle_time_irq(void)
> >  
> >  	/* Account time spent with enabled wait psw loaded as idle time. */
> >  	WRITE_ONCE(idle->idle_time, READ_ONCE(idle->idle_time) + idle_time);
> > -	WRITE_ONCE(idle->idle_count, READ_ONCE(idle->idle_count) + 1);
> > +	INC_ONC(idle->idle_count);
> 
> (nit: drive-by typo)

Heh!  I was clearly building with GCC.  ;-)

> [...]
> 
> > diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> > index c9b58188ec61..77c253e29758 100644
> > --- a/include/linux/compiler-gcc.h
> > +++ b/include/linux/compiler-gcc.h
> > @@ -137,3 +137,8 @@
> >  #if GCC_VERSION < 90100
> >  #undef __alloc_size__
> >  #endif
> > +
> > +/*
> > + * GCC can't properly optimize the real one with volatile on.
> > + */
> > +#define INC_ONCE(v) (v)++
> 
> I think I'm missing what we're trying to do here, but how is this a
> safe fallback? I'd have thought we'd need the whole READ_ONCE() +
> WRITE_ONCE() sequence if the compiler doesn't give us what we need...
> 
> > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > index efd43df3a99a..b1b13dac1b9e 100644
> > --- a/include/linux/compiler.h
> > +++ b/include/linux/compiler.h
> > @@ -8,6 +8,10 @@
> >  
> >  #ifdef __KERNEL__
> >  
> > +#ifndef INC_ONCE
> > +#define INC_ONCE(v)	(*(volatile typeof(v) *)&(v))++
> > +#endif
> 
> ... but I'm honestly not sure what this is aiming to elide. Is this about
> having the compiler emit an increment instruction with memory operands
> on architectures that support it? If so, what about architectures that
> _don't_ support that? We still need to guarantee that the load/store
> instructions are single-copy atomic for the type in that case.

Architectures whose increment instructions lack memory operands can still
load, increment, then store.  So yes, if two of these run concurrently,
you can lose counts when incrementing normal memory.  (Of course, with
device memory, you get what you get.)

> Given how hard it can be to get the compiler to break atomicity for
> plain accesses, I'm pretty worried about the robustness of this.

Your point being that the current plain C-language postfix ++ is unlikely
to be adversely optimized?  If so, although I agree here in 2024,
the years pass quickly and increasingly clever compiler optimizations
accumulate.

							Thanx, Paul