[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250928162054.GB3121@redhat.com>
Date: Sun, 28 Sep 2025 18:20:54 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Boqun Feng <boqun.feng@...il.com>, David Howells <dhowells@...hat.com>,
Ingo Molnar <mingo@...hat.com>, Li RongQing <lirongqing@...du.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Waiman Long <longman@...hat.com>, Will Deacon <will@...nel.org>
Cc: linux-kernel@...r.kernel.org
Subject: [RFC 2/1] seqlock: make the read_seqbegin_or_lock() API more simple
and less error-prone ?
Another problem is that this API is error prone. Two years ago half of the
read_seqbegin_or_lock() users were buggy (they followed the wrong example
from Documentation/locking/seqlock.rst). And even if the code is mostly
correct it is very easy to add a hard-to-detect mistake, see for example
[PATCH][v3] afs: Remove erroneous seq |= 1 in volume lookup loop
https://lore.kernel.org/all/20250910084235.2630-1-lirongqing@baidu.com/
Can we improve this API?
-------------------------------------------------------------------------------
To simplify, suppose we add the new helper
static inline int need_seqretry_xxx(seqlock_t *lock, int *seq)
{
int ret = !(*seq & 1) && read_seqretry(lock, *seq);
if (ret)
++*seq; /* make this counter odd */
return ret;
}
which can be used instead of need_seqretry(). This way the users do not
need to modify "seq" in the main loop. For example, we can simplify
thread_group_cputime() as follows:
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -314,7 +314,7 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
struct signal_struct *sig = tsk->signal;
u64 utime, stime;
struct task_struct *t;
- unsigned int seq, nextseq;
+ unsigned int seq;
unsigned long flags;
/*
@@ -330,9 +330,8 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
rcu_read_lock();
/* Attempt a lockless read on the first round. */
- nextseq = 0;
+ seq = 0;
do {
- seq = nextseq;
flags = read_seqbegin_or_lock_irqsave(&sig->stats_lock, &seq);
times->utime = sig->utime;
times->stime = sig->stime;
@@ -344,9 +343,7 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
times->stime += stime;
times->sum_exec_runtime += read_sum_exec_runtime(t);
}
- /* If lockless access failed, take the lock. */
- nextseq = 1;
- } while (need_seqretry(&sig->stats_lock, seq));
+ } while (need_seqretry_xxx(&sig->stats_lock, &seq));
done_seqretry_irqrestore(&sig->stats_lock, seq, flags);
rcu_read_unlock();
}
most (if not all) of other users can be changed the same way.
-------------------------------------------------------------------------------
Or perhaps we can even add a helper that hides all the details, something like
int xxx(seqlock_t *lock, int *seq, int lockless)
{
if (lockless) {
*seq = read_seqbegin(lock);
return 1;
}
if (*seq & 1) {
read_sequnlock_excl(lock);
return 0;
}
if (read_seqretry(lock, *seq)) {
read_seqlock_excl(lock);
*seq = 1;
return 1;
}
return 0;
}
#define __XXX(lock, seq, lockless) \
for (int lockless = 1, seq; xxx(lock, &seq, lockless); lockless = 0)
#define XXX(lock) \
__XXX(lock, __UNIQUE_ID(seq), __UNIQUE_ID(lockless))
?
This way the users can simply do:
seqlock_t sl;
void func(void)
{
XXX(&sl) {
... read-side critical section ...
}
}
using only the new XXX() helper. No need to declare/initialize seq, no need
for need_seqretry/done_seqretry.
What do you think?
Oleg.
Powered by blists - more mailing lists