[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <172224924751.2215.13825232512894013809.tip-bot2@tip-bot2>
Date: Mon, 29 Jul 2024 10:34:07 -0000
From: "tip-bot2 for Peilin He" <tip-bot2@...utronix.de>
To: linux-tip-commits@...r.kernel.org
Cc: Peter Zijlstra <peterz@...radead.org>, Peilin He <he.peilin@....com.cn>,
xu xin <xu.xin16@....com.cn>, Yunkai Zhang <zhang.yunkai@....com.cn>,
Qiang Tu <tu.qiang35@....com.cn>, Kun Jiang <jiang.kun2@....com.cn>,
Fan Yu <fan.yu9@....com.cn>, x86@...nel.org, linux-kernel@...r.kernel.org
Subject: [tip: sched/core] sched/core: Add WARN_ON_ONCE() to check overflow
for migrate_disable()
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 0ec8d5aed4d14055aab4e2746def33f8b0d409c3
Gitweb: https://git.kernel.org/tip/0ec8d5aed4d14055aab4e2746def33f8b0d409c3
Author: Peilin He <he.peilin@....com.cn>
AuthorDate: Tue, 16 Jul 2024 10:42:44 +08:00
Committer: Peter Zijlstra <peterz@...radead.org>
CommitterDate: Mon, 29 Jul 2024 12:22:34 +02:00
sched/core: Add WARN_ON_ONCE() to check overflow for migrate_disable()
Background
==========
When repeated migrate_disable() calls are made with missing the
corresponding migrate_enable() calls, there is a risk of
'migration_disabled' going upper overflow because
'migration_disabled' is a type of unsigned short whose max value is
65535.
In PREEMPT_RT kernel, if 'migration_disabled' goes upper overflow, it may
make the migrate_disable() ineffective within local_lock_irqsave(). This
is because, during the scheduling procedure, the value of
'migration_disabled' will be checked, which can trigger CPU migration.
Consequently, the count of 'rcu_read_lock_nesting' may leak due to
local_lock_irqsave() and local_unlock_irqrestore() occurring on different
CPUs.
Usecase
========
For example, When I developed a driver, I encountered a warning like
"WARNING: CPU: 4 PID: 260 at kernel/rcu/tree_plugin.h:315
rcu_note_context_switch+0xa8/0x4e8" warning. It took me half a month
to locate this issue. Ultimately, I discovered that the lack of upper
overflow detection mechanism in migrate_disable() was the root cause,
leading to a significant amount of time spent on problem localization.
If the upper overflow detection mechanism was added to migrate_disable(),
the root cause could be very quickly and easily identified.
Effect
======
Using WARN_ON_ONCE() to check if 'migration_disabled' is upper overflow
can help developers identify the issue quickly.
Suggested-by: Peter Zijlstra <peterz@...radead.org>
Signed-off-by: Peilin He<he.peilin@....com.cn>
Signed-off-by: xu xin <xu.xin16@....com.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Reviewed-by: Yunkai Zhang <zhang.yunkai@....com.cn>
Reviewed-by: Qiang Tu <tu.qiang35@....com.cn>
Reviewed-by: Kun Jiang <jiang.kun2@....com.cn>
Reviewed-by: Fan Yu <fan.yu9@....com.cn>
Link: https://lkml.kernel.org/r/20240716104244764N2jD8gnBpnsLjCDnQGQ8c@zte.com.cn
---
kernel/sched/core.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2c61b4f..db5823f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2233,6 +2233,12 @@ void migrate_disable(void)
struct task_struct *p = current;
if (p->migration_disabled) {
+#ifdef CONFIG_DEBUG_PREEMPT
+ /*
+ *Warn about overflow half-way through the range.
+ */
+ WARN_ON_ONCE((s16)p->migration_disabled < 0);
+#endif
p->migration_disabled++;
return;
}
@@ -2251,14 +2257,20 @@ void migrate_enable(void)
.flags = SCA_MIGRATE_ENABLE,
};
+#ifdef CONFIG_DEBUG_PREEMPT
+ /*
+ * Check both overflow from migrate_disable() and superfluous
+ * migrate_enable().
+ */
+ if (WARN_ON_ONCE((s16)p->migration_disabled <= 0))
+ return;
+#endif
+
if (p->migration_disabled > 1) {
p->migration_disabled--;
return;
}
- if (WARN_ON_ONCE(!p->migration_disabled))
- return;
-
/*
* Ensure stop_task runs either before or after this, and that
* __set_cpus_allowed_ptr(SCA_MIGRATE_ENABLE) doesn't schedule().
Powered by blists - more mailing lists