[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20211202101238.33546-1-elver@google.com>
Date: Thu, 2 Dec 2021 11:12:38 +0100
From: Marco Elver <elver@...gle.com>
To: elver@...gle.com, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
Waiman Long <longman@...hat.com>,
Boqun Feng <boqun.feng@...il.com>, linux-kernel@...r.kernel.org
Cc: kasan-dev@...glegroups.com, Thomas Gleixner <tglx@...utronix.de>,
Mark Rutland <mark.rutland@....com>,
"Paul E. McKenney" <paulmck@...nel.org>
Subject: [PATCH] locking/mutex: Mark racy reads of owner->on_cpu
One of the more frequent data races reported by KCSAN is the racy read
in mutex_spin_on_owner(), which is usually reported as "race of unknown
origin" without showing the writer. This is due to the racing write
occurring in kernel/sched. Locally enabling KCSAN in kernel/sched shows:
| write (marked) to 0xffff97f205079934 of 4 bytes by task 316 on cpu 6:
| finish_task kernel/sched/core.c:4632 [inline]
| finish_task_switch kernel/sched/core.c:4848
| context_switch kernel/sched/core.c:4975 [inline]
| __schedule kernel/sched/core.c:6253
| schedule kernel/sched/core.c:6326
| schedule_preempt_disabled kernel/sched/core.c:6385
| __mutex_lock_common kernel/locking/mutex.c:680
| __mutex_lock kernel/locking/mutex.c:740 [inline]
| __mutex_lock_slowpath kernel/locking/mutex.c:1028
| mutex_lock kernel/locking/mutex.c:283
| tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
| ...
|
| read to 0xffff97f205079934 of 4 bytes by task 322 on cpu 3:
| mutex_spin_on_owner kernel/locking/mutex.c:370
| mutex_optimistic_spin kernel/locking/mutex.c:480
| __mutex_lock_common kernel/locking/mutex.c:610
| __mutex_lock kernel/locking/mutex.c:740 [inline]
| __mutex_lock_slowpath kernel/locking/mutex.c:1028
| mutex_lock kernel/locking/mutex.c:283
| tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
| ...
|
| value changed: 0x00000001 -> 0x00000000
This race is clearly intentional, and the potential for miscompilation
is slim due to surrounding barrier() and cpu_relax(), and the value
being used as a boolean.
Nevertheless, marking this reader would more clearly denote intent and
make it obvious that concurrency is expected. Use READ_ONCE() to avoid
having to reason about compiler optimizations now and in future.
Similarly, mark the read to owner->on_cpu in mutex_can_spin_on_owner(),
which immediately precedes the loop executing mutex_spin_on_owner().
Signed-off-by: Marco Elver <elver@...gle.com>
---
I decided to send this out now due to the discussion at [1], because it
is one of the first things that people notice when enabling KCSAN.
[1] https://lkml.kernel.org/r/811af0bc-0c99-37f6-a39a-095418b10661@huawei.com
It had been reported before, but never with the 2nd stack trace -- so at
the very least this patch can now serve as a reference.
Thanks,
-- Marco
---
kernel/locking/mutex.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index db1913611192..50c03a3fa61e 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -367,7 +367,7 @@ bool mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner,
/*
* Use vcpu_is_preempted to detect lock holder preemption issue.
*/
- if (!owner->on_cpu || need_resched() ||
+ if (!READ_ONCE(owner->on_cpu) || need_resched() ||
vcpu_is_preempted(task_cpu(owner))) {
ret = false;
break;
@@ -410,7 +410,7 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
*/
if (owner)
- retval = owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
+ retval = READ_ONCE(owner->on_cpu) && !vcpu_is_preempted(task_cpu(owner));
/*
* If lock->owner is not set, the mutex has been released. Return true
--
2.34.0.rc2.393.gf8c9666880-goog
Powered by blists - more mailing lists