Message-ID: <20240716091639.GB26750@noisy.programming.kicks-ass.net>
Date: Tue, 16 Jul 2024 11:16:39 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: xu.xin16@....com.cn
Cc: bsegall@...gle.com, dietmar.eggemann@....com, fan.yu9@....com.cn,
he.peilin@....com.cn, jiang.kun2@....com.cn, juri.lelli@...hat.com,
linux-kernel@...r.kernel.org, liu.chun2@....com.cn, mgorman@...e.de,
mingo@...hat.com, rostedt@...dmis.org, tu.qiang35@....com.cn,
vincent.guittot@...aro.org, yang.yang29@....com.cn,
zhang.yunkai@....com.cn
Subject: Re: [PATCH linux-next v3 RESEND] sched/core: Add WARN_ON_ONCE() to
check overflow for migrate_disable
On Tue, Jul 16, 2024 at 10:42:44AM +0800, xu.xin16@....com.cn wrote:
> From: Peilin He <he.peilin@....com.cn>
>
> Background
> ==========
> When migrate_disable() is called repeatedly without the corresponding
> migrate_enable() calls, 'migration_disabled' can overflow, because it
> is an unsigned short whose maximum value is 65535.
>
> On a PREEMPT_RT kernel, if 'migration_disabled' overflows, migrate_disable()
> may become ineffective within local_lock_irqsave(). This is because the
> scheduler checks the value of 'migration_disabled'; once it has
> overflowed, that check can permit CPU migration. Consequently, the
> 'rcu_read_lock_nesting' count may leak because local_lock_irqsave() and
> local_unlock_irqrestore() run on different CPUs.
>
> Use case
> ========
> For example, while developing a driver, I encountered a warning like
> "WARNING: CPU: 4 PID: 260 at kernel/rcu/tree_plugin.h:315
> rcu_note_context_switch+0xa8/0x4e8". It took me half a month to locate
> the issue. Ultimately, I discovered that the root cause was hard to find
> because migrate_disable() has no overflow detection, so a significant
> amount of time was spent on problem localization.
>
> If migrate_disable() had overflow detection, the root cause could have
> been identified quickly and easily.
>
> Effect
> ======
> Using WARN_ON_ONCE() to check whether 'migration_disabled' has overflowed
> helps developers identify the issue quickly.
>
> Signed-off-by: Peilin He <he.peilin@....com.cn>
> Signed-off-by: xu xin <xu.xin16@....com.cn>
> Reviewed-by: Yunkai Zhang <zhang.yunkai@....com.cn>
> Reviewed-by: Qiang Tu <tu.qiang35@....com.cn>
> Reviewed-by: Kun Jiang <jiang.kun2@....com.cn>
> Reviewed-by: Fan Yu <fan.yu9@....com.cn>
> Cc: Yang Yang <yang.yang29@....com.cn>
> Cc: Liu Chun <liu.chun2@....com.cn>
> Suggested-by: Peter Zijlstra <peterz@...radead.org>
Thanks, I'll queue this for sched/urgent once -rc1 rolls around.
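
For readers following along, the overflow described in the Background section can be reproduced in userspace. The sketch below is only an illustration and is not the code from the patch: struct task_model, model_migrate_disable() and demo_unbalanced_calls() are hypothetical names, and the one-shot warning merely imitates the spirit of WARN_ON_ONCE(). It assumes the counter is an unsigned short, as stated in the quoted description.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model of a task; only the counter is modelled. */
struct task_model {
	unsigned short migration_disabled;	/* nesting depth, max USHRT_MAX */
	bool warned;				/* imitates the "once" of WARN_ON_ONCE() */
};

/*
 * Model of a migrate_disable() call: bump the nesting counter.
 * Returns true if this call wraps the counter back to 0.
 */
static bool model_migrate_disable(struct task_model *p)
{
	bool overflow;

	assert(p != NULL);
	overflow = (p->migration_disabled == USHRT_MAX);
	if (overflow && !p->warned) {
		p->warned = true;
		fprintf(stderr, "WARNING: migration_disabled overflow\n");
	}
	p->migration_disabled++;	/* unsigned short wraps from USHRT_MAX to 0 */
	return overflow;
}

/* Issue n disable calls with no matching enable; return the final counter. */
static unsigned short demo_unbalanced_calls(struct task_model *p, unsigned long n)
{
	unsigned long i;

	for (i = 0; i < n; i++)
		model_migrate_disable(p);
	return p->migration_disabled;
}
```

In this model, USHRT_MAX unbalanced calls leave the counter at its maximum, and one more call wraps it to 0. A counter of 0 is indistinguishable from "migration enabled", which is the situation the quoted description says can let the scheduler migrate the task.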