Message-ID: <20240628125404912pr89b8ev3h97gu5cn280C@zte.com.cn>
Date: Fri, 28 Jun 2024 12:54:04 +0800 (CST)
From: <xu.xin16@....com.cn>
To: <mingo@...hat.com>, <peterz@...radead.org>, <juri.lelli@...hat.com>,
        <vincent.guittot@...aro.org>, <ietmar.eggemann@....com>,
        <ostedt@...dmis.org>, <bsegall@...gle.com>, <mgorman@...e.de>,
        <bristot@...hat.com>
Cc: <he.peilin@....com.cn>, <yang.yang29@....com.cn>, <tu.qiang35@....com.cn>,
        <jiang.kun2@....com.cn>, <xu.xin16@....com.cn>,
        <zhang.yunkai@....com.cn>, <liu.chun2@....com.cn>,
        <fan.yu9@....com.cn>, <linux-kernel@...r.kernel.org>
Subject: [PATCH linux-next] sched/core: Add WARN() to checks in migrate_disable()

From: Peilin He <he.peilin@....com.cn>

Background
==========
When migrate_disable() is called repeatedly without the
corresponding migrate_enable() calls, 'migration_disabled' can
overflow, because it is an unsigned short whose maximum value
is 65535.
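
The wraparound can be reproduced in user space. The sketch below is
only an illustration: 'struct fake_task', fake_migrate_disable() and
nest_n_times() are hypothetical stand-ins and not kernel code. It
models an unbalanced nesting counter of the same type and shows that
one increment past USHRT_MAX yields 0.

```c
#include <limits.h>

/* Hypothetical user-space stand-in for task_struct::migration_disabled. */
struct fake_task {
	unsigned short migration_disabled;
};

/* Model of a nesting increment that is never paired with a decrement. */
static void fake_migrate_disable(struct fake_task *p)
{
	p->migration_disabled++;
}

/* Call fake_migrate_disable() n times and return the resulting count. */
static unsigned short nest_n_times(unsigned long n)
{
	struct fake_task t = { 0 };
	unsigned long i;

	for (i = 0; i < n; i++)
		fake_migrate_disable(&t);
	return t.migration_disabled;
}
```

With this model, nest_n_times(USHRT_MAX) returns USHRT_MAX, and one
more call wraps the counter back to 0, the same value it has when
migration is not disabled at all.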

In a PREEMPT_RT kernel, if 'migration_disabled' overflows and
wraps to 0, the migrate_disable() call inside local_lock_irqsave()
may become ineffective. The scheduler checks the value of
'migration_disabled' when deciding whether a task may be moved,
so a wrapped value of 0 can allow CPU migration. Consequently,
the 'rcu_read_lock_nesting' count may leak because
local_lock_irqsave() and local_unlock_irqrestore() run on
different CPUs.

Use case
========
For example, while developing a driver, I encountered the warning
"WARNING: CPU: 4 PID: 260 at kernel/rcu/tree_plugin.h:315
rcu_note_context_switch+0xa8/0x4e8". It took me half a month to
locate this issue. Ultimately, I discovered that
'migration_disabled' had overflowed; because migrate_disable()
has no overflow detection, a significant amount of time was spent
localizing the problem.

If overflow detection were added to migrate_disable(), the root
cause could be identified quickly and easily.

Effect
======
Using WARN() to check whether 'migration_disabled' is about to
overflow helps developers identify the issue quickly.
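
As a rough user-space model of the check (an assumption-laden sketch,
not the kernel implementation): 'struct model_task' and
model_migrate_disable() are hypothetical names, fprintf() stands in
for WARN(), and the first-level path is reduced to setting the count
to 1, omitting whatever else the real function does there.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant task_struct field. */
struct model_task {
	unsigned short migration_disabled;
};

/*
 * Model of the patched nested path. Returns true if the overflow
 * warning fired, i.e. the counter was already at USHRT_MAX and the
 * increment below wraps it to 0.
 */
static bool model_migrate_disable(struct model_task *p)
{
	bool warned = false;

	if (p->migration_disabled) {
		if (p->migration_disabled == USHRT_MAX) {
			/* Stand-in for WARN(1, ...) in the patch. */
			fprintf(stderr,
				"migration_disabled has encountered an overflow.\n");
			warned = true;
		}
		p->migration_disabled++;
		return warned;
	}

	/* Simplified first-level path: other bookkeeping is omitted. */
	p->migration_disabled = 1;
	return warned;
}
```

In this model the warning fires only on the call that wraps the
counter, which is the point at which the imbalance becomes harmful.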

Signed-off-by: Peilin He <he.peilin@....com.cn>
Signed-off-by: xu xin <xu.xin16@....com.cn>
Reviewed-by: Yunkai Zhang <zhang.yunkai@....com.cn>
Reviewed-by: Qiang Tu <tu.qiang35@....com.cn>
Reviewed-by: Kun Jiang <jiang.kun2@....com.cn>
Reviewed-by: Fan Yu <fan.yu9@....com.cn>
Cc: Yang Yang <yang.yang29@....com.cn>
Cc: Liu Chun <liu.chun2@....com.cn>
---
 kernel/sched/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8cc4975d6b2b..14671291564c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2259,6 +2259,8 @@ void migrate_disable(void)
 	struct task_struct *p = current;

 	if (p->migration_disabled) {
+		if (p->migration_disabled == USHRT_MAX)
+			WARN(1, "migration_disabled has encountered an overflow.\n");
 		p->migration_disabled++;
 		return;
 	}
-- 
2.17.1
