lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251009232008.GM1386988@noisy.programming.kicks-ass.net>
Date: Fri, 10 Oct 2025 01:20:08 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Oleg Nesterov <oleg@...hat.com>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Boqun Feng <boqun.feng@...il.com>,
	David Howells <dhowells@...hat.com>, Ingo Molnar <mingo@...hat.com>,
	Li RongQing <lirongqing@...du.com>,
	Waiman Long <longman@...hat.com>, Will Deacon <will@...nel.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/4] seqlock: introduce scoped_seqlock_read() and
 scoped_seqlock_read_irqsave()

On Fri, Oct 10, 2025 at 12:12:42AM +0200, Peter Zijlstra wrote:
> On Thu, Oct 09, 2025 at 01:24:51PM -0700, Linus Torvalds wrote:
> > On Thu, 9 Oct 2025 at 13:12, Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > Slightly nicer version that's actually compiled :-)
> > 
> > I assume that "target of ss_lockless" is an intentional extension to
> > just make the loop never take a lock at all?
> 
> Yep. Almost came for free, so might as well do it.
> 
> > I do like it, but I think you'll find that having a separate 'seq' and
> > 'flags' like this:
> > 
> > > +struct ss_tmp {
> > > +       enum ss_state   state;
> > > +       int             seq;
> > > +       unsigned long   flags;
> > > +       spinlock_t      *lock;
> > > +};
> > 
> > makes it unnecessarily waste a register.
> > 
> > You never need both seq and flags at the same time, since if you take
> > the spinlock the sequence number is pointless.
> > 
> > So please make that a union, and I think it will help avoid wasting a
> > register in the loop.
> 
> Sure; otoh compiler should be able to tell the same using liveness
> analysis I suppose, but perhaps they're not *that* clever.

The moment I use a union gcc-14 creates the whole structure on the
stack, while without the union it mostly manages to keep the things
in registers without too much spilling.

Also, there's a rather silly bug in the code, the __cleanup function
looks at ->state to determine which spin_unlock variant to pick, but
->state will be ss_done. Additionally, when someone does 'break' or
'goto' out of the scope the state isn't reliably anyway.

The sanest code gen happens when I do:

struct ss_tmp {
	enum ss_state	state;
	int		seq;
	unsigned long	flags;
	spinlock_t	*lock;
	spinlock_t	*lock_irqsave;
}

static inline void __scoped_seqlock_cleanup(struct ss_tmp *sst)
{
	if (sst->lock)
		spin_unlock(sst->lock);
	if (sst->lock_irqsave)
		spin_unlock_irqrestore(sst->lock, sst->flags);
}

liveness analysis seems to work well, and it is able to determine which
fields are active and only track those in registers and utterly discard
the rest. Specifically in the thread_group_cputime code, there are only
_raw_spin_lock_irqsave and _raw_spin_unlock_irqrestore calls left, all
the other stuff is just optimized out.

I'll look what clang does ... but that's really tomorrow, or rather,
somewhat later this morning :/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ