linux-kernel - [RFC 2/1] seqlock: make the read_seqbegin_or_lock() API more simple and less error-prone ?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20250928162054.GB3121@redhat.com>
Date: Sun, 28 Sep 2025 18:20:54 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Boqun Feng <boqun.feng@...il.com>, David Howells <dhowells@...hat.com>,
	Ingo Molnar <mingo@...hat.com>, Li RongQing <lirongqing@...du.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Waiman Long <longman@...hat.com>, Will Deacon <will@...nel.org>
Cc: linux-kernel@...r.kernel.org
Subject: [RFC 2/1] seqlock: make the read_seqbegin_or_lock() API more simple
 and less error-prone ?

Another problem is that this API is error prone. Two years ago half of the
read_seqbegin_or_lock() users were buggy (they followed the wrong example
from Documentation/locking/seqlock.rst). And even if the code is mostly
correct it is very easy to add a hard-to-detect mistake, see for example

	[PATCH][v3] afs: Remove erroneous seq |= 1 in volume lookup loop
	https://lore.kernel.org/all/20250910084235.2630-1-lirongqing@baidu.com/

Can we improve this API?

-------------------------------------------------------------------------------
To simplify, suppose we add the new helper

	static inline int need_seqretry_xxx(seqlock_t *lock, int *seq)
	{
		int ret = !(*seq & 1) && read_seqretry(lock, *seq);

		if (ret)
			++*seq;	/* make this counter odd */

		return ret;
	}

which can be used instead of need_seqretry(). This way the users do not
need to modify "seq" in the main loop. For example, we can simplify
thread_group_cputime() as follows:

	--- a/kernel/sched/cputime.c
	+++ b/kernel/sched/cputime.c
	@@ -314,7 +314,7 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
		struct signal_struct *sig = tsk->signal;
		u64 utime, stime;
		struct task_struct *t;
	-	unsigned int seq, nextseq;
	+	unsigned int seq;
		unsigned long flags;
	 
		/*
	@@ -330,9 +330,8 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
	 
		rcu_read_lock();
		/* Attempt a lockless read on the first round. */
	-	nextseq = 0;
	+	seq = 0;
		do {
	-		seq = nextseq;
			flags = read_seqbegin_or_lock_irqsave(&sig->stats_lock, &seq);
			times->utime = sig->utime;
			times->stime = sig->stime;
	@@ -344,9 +343,7 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
				times->stime += stime;
				times->sum_exec_runtime += read_sum_exec_runtime(t);
			}
	-		/* If lockless access failed, take the lock. */
	-		nextseq = 1;
	-	} while (need_seqretry(&sig->stats_lock, seq));
	+	} while (need_seqretry_xxx(&sig->stats_lock, &seq));
		done_seqretry_irqrestore(&sig->stats_lock, seq, flags);
		rcu_read_unlock();
	 }

most (if not all) of other users can be changed the same way.

-------------------------------------------------------------------------------
Or perhaps we can even add a helper that hides all the details, something like

	int xxx(seqlock_t *lock, int *seq, int lockless)
	{
		if (lockless) {
			*seq = read_seqbegin(lock);
			return 1;
		}

		if (*seq & 1) {
			read_sequnlock_excl(lock);
			return 0;
		}

		if (read_seqretry(lock, *seq)) {
			read_seqlock_excl(lock);
			*seq = 1;
			return 1;
		}

		return 0;

	}

	#define __XXX(lock, seq, lockless)	\
		for (int lockless = 1, seq; xxx(lock, &seq, lockless); lockless = 0)

	#define XXX(lock)	\
		__XXX(lock, __UNIQUE_ID(seq), __UNIQUE_ID(lockless))


?

This way the users can simply do:

	seqlock_t sl;

	void func(void)
	{
		XXX(&sl) {
			... read-side critical section ...
		}
	}

using only the new XXX() helper. No need to declare/initialize seq, no need
for need_seqretry/done_seqretry.

What do you think?

Oleg.