Message-ID: <aRGi50r1BYcKTNJA@gpd4>
Date: Mon, 10 Nov 2025 09:31:35 +0100
From: Andrea Righi <arighi@...dia.com>
To: Tejun Heo <tj@...nel.org>
Cc: David Vernet <void@...ifault.com>, Changwoo Min <changwoo@...lia.com>,
Dan Schatzberg <schatzberg.dan@...il.com>,
Emil Tsalapatis <etsal@...a.com>, sched-ext@...ts.linux.dev,
linux-kernel@...r.kernel.org,
Douglas Anderson <dianders@...omium.org>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 10/13] sched_ext: Hook up hardlockup detector
On Sun, Nov 09, 2025 at 08:31:09AM -1000, Tejun Heo wrote:
> A poorly behaving BPF scheduler can trigger hard lockup. For example, on a
> large system with many tasks pinned to different subsets of CPUs, if the BPF
> scheduler puts all tasks in a single DSQ and lets all CPUs at it, the DSQ lock
> can be contended to the point where hardlockup triggers. Unfortunately,
> hardlockup can be the first signal out of such situations, thus requiring
> hardlockup handling.
>
> Hook scx_hardlockup() into the hardlockup detector to try kicking out the
> current scheduler in an attempt to recover the system to a good state. The
> handling strategy can delay watchdog taking its own action by one polling
> period; however, given that the only remediation for hardlockup is crash, this
> is likely an acceptable trade-off.
>
> Reported-by: Dan Schatzberg <schatzberg.dan@...il.com>
> Cc: Emil Tsalapatis <etsal@...a.com>
> Cc: Douglas Anderson <dianders@...omium.org>
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Signed-off-by: Tejun Heo <tj@...nel.org>
Makes sense to me, from a sched_ext perspective:
Reviewed-by: Andrea Righi <arighi@...dia.com>
Thanks,
-Andrea
> ---
> include/linux/sched/ext.h | 1 +
> kernel/sched/ext.c | 18 ++++++++++++++++++
> kernel/watchdog.c | 9 +++++++++
> 3 files changed, 28 insertions(+)
>
> diff --git a/include/linux/sched/ext.h b/include/linux/sched/ext.h
> index e1502faf6241..12561a3fcee4 100644
> --- a/include/linux/sched/ext.h
> +++ b/include/linux/sched/ext.h
> @@ -223,6 +223,7 @@ struct sched_ext_entity {
> void sched_ext_dead(struct task_struct *p);
> void print_scx_info(const char *log_lvl, struct task_struct *p);
> void scx_softlockup(u32 dur_s);
> +bool scx_hardlockup(void);
> bool scx_rcu_cpu_stall(void);
>
> #else /* !CONFIG_SCHED_CLASS_EXT */
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 4507bc4f0b5c..bd66178e5927 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -3710,6 +3710,24 @@ void scx_softlockup(u32 dur_s)
> smp_processor_id(), dur_s);
> }
>
> +/**
> + * scx_hardlockup - sched_ext hardlockup handler
> + *
> + * A poorly behaving BPF scheduler can trigger hard lockup by e.g. putting
> + * numerous affinitized tasks in a single queue and directing all CPUs at it.
> + * Try kicking out the current scheduler in an attempt to recover the system to
> + * a good state before taking more drastic actions.
> + */
> +bool scx_hardlockup(void)
> +{
> +	if (!handle_lockup("hard lockup - CPU %d", smp_processor_id()))
> +		return false;
> +
> +	printk_deferred(KERN_ERR "sched_ext: Hard lockup - CPU %d, disabling BPF scheduler\n",
> +			smp_processor_id());
> +	return true;
> +}
> +
> /**
> * scx_bypass - [Un]bypass scx_ops and guarantee forward progress
> * @bypass: true for bypass, false for unbypass
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 5b62d1002783..8dfac4a8f587 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -196,6 +196,15 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
> #ifdef CONFIG_SYSFS
> ++hardlockup_count;
> #endif
> +		/*
> +		 * A poorly behaving BPF scheduler can trigger hard lockup by
> +		 * e.g. putting numerous affinitized tasks in a single queue and
> +		 * directing all CPUs at it. The following call can return true
> +		 * only once when sched_ext is enabled and will immediately
> +		 * abort the BPF scheduler and print out a warning message.
> +		 */
> +		if (scx_hardlockup())
> +			return;
>
> /* Only print hardlockups once. */
> if (per_cpu(watchdog_hardlockup_warned, cpu))
> --
> 2.51.1
>
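
For readers following the control flow, here is a minimal userspace sketch of
the "return true only once" behavior described in the watchdog.c comment above.
It is not the kernel code: handle_lockup() is not shown in this patch, so the
sketch assumes a single atomic flag stands in for "a BPF scheduler is loaded",
and every name starting with model_ is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for "a BPF scheduler is currently loaded". */
static atomic_bool model_scheduler_enabled = true;

/*
 * Hypothetical model of the handler. atomic_exchange() returns the
 * previous value, so only the first caller observes true and gets to
 * report that it kicked the scheduler out.
 */
static bool model_scx_hardlockup(int cpu)
{
	if (!atomic_exchange(&model_scheduler_enabled, false))
		return false;

	fprintf(stderr, "model: hard lockup - CPU %d, disabling scheduler\n", cpu);
	return true;
}

/*
 * Hypothetical model of the watchdog hook. Returns true when the
 * watchdog proceeds to its own hardlockup handling, and false when it
 * backs off to give the system a chance to recover.
 */
static bool model_watchdog_check(int cpu)
{
	if (model_scx_hardlockup(cpu))
		return false;

	/* The normal hardlockup reporting path would run here. */
	return true;
}
```

Under these assumptions the first detection backs off and every later one
falls through to the regular watchdog path, which matches the one polling
period of delay mentioned in the commit message.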