Message-ID: <CABFh=a7WYYmhO-q_VbSCc8c5E24gRzB05MG2Yc9+ev5TM14sMQ@mail.gmail.com>
Date: Thu, 13 Nov 2025 21:00:51 -0500
From: Emil Tsalapatis <linux-lists@...alapatis.com>
To: Tejun Heo <tj@...nel.org>
Cc: Doug Anderson <dianders@...omium.org>, David Vernet <void@...ifault.com>,
Andrea Righi <andrea.righi@...ux.dev>, Changwoo Min <changwoo@...lia.com>,
Dan Schatzberg <schatzberg.dan@...il.com>, Emil Tsalapatis <etsal@...a.com>, sched-ext@...ts.linux.dev,
linux-kernel@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
Andrea Righi <arighi@...dia.com>
Subject: Re: [PATCH sched_ext/for-6.19] sched_ext: Pass locked CPU parameter
to scx_hardlockup() and add docs
On Thu, Nov 13, 2025 at 8:34 PM Tejun Heo <tj@...nel.org> wrote:
>
> With the buddy lockup detector, smp_processor_id() returns the detecting CPU,
> not the locked CPU, making scx_hardlockup()'s printouts confusing. Pass the
> locked CPU number from watchdog_hardlockup_check() as a parameter instead.
>
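To make the detector-vs-locked distinction concrete, here is a userspace-only
sketch (not part of the patch, and not kernel code; fake_smp_processor_id()
and the CPU numbers are made up for illustration). With the buddy detector,
the CPU running watchdog_hardlockup_check() is watching a *different* CPU, so
printing smp_processor_id() names the detector rather than the CPU that
actually stalled:

/*
 * Illustrative userspace sketch only: shows why the caller must pass the
 * locked CPU instead of using the current CPU id.
 */
#include <stdio.h>

static int fake_smp_processor_id(void)
{
	return 3;			/* CPU running the buddy check */
}

/* Before the patch: reports whichever CPU happens to run the detector. */
static void report_old(void)
{
	printf("sched_ext: Hard lockup - CPU %d\n", fake_smp_processor_id());
}

/* After the patch: reports the CPU the detector was actually watching. */
static void report_new(int cpu)
{
	printf("sched_ext: Hard lockup - CPU %d\n", cpu);
}

int main(void)
{
	int locked_cpu = 5;		/* the buddy CPU that stopped responding */

	report_old();			/* prints CPU 3 - the detector, misleading */
	report_new(locked_cpu);		/* prints CPU 5 - the locked CPU */
	return 0;
}

If I understand correctly, with the perf/NMI-based detector the two CPUs
coincide, which is why the old printout happened to be right there.
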
> Also add kerneldoc comments to handle_lockup(), scx_hardlockup(), and
> scx_rcu_cpu_stall() documenting their return value semantics.
>
> Suggested-by: Doug Anderson <dianders@...omium.org>
> Signed-off-by: Tejun Heo <tj@...nel.org>
> ---
Reviewed-by: Emil Tsalapatis <emil@...alapatis.com>
> include/linux/sched/ext.h | 4 ++--
> kernel/sched/ext.c | 25 ++++++++++++++++++++++---
> kernel/watchdog.c | 2 +-
> 3 files changed, 25 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/sched/ext.h b/include/linux/sched/ext.h
> index 70ee5c28a74d..bcb962d5ee7d 100644
> --- a/include/linux/sched/ext.h
> +++ b/include/linux/sched/ext.h
> @@ -230,7 +230,7 @@ struct sched_ext_entity {
> void sched_ext_dead(struct task_struct *p);
> void print_scx_info(const char *log_lvl, struct task_struct *p);
> void scx_softlockup(u32 dur_s);
> -bool scx_hardlockup(void);
> +bool scx_hardlockup(int cpu);
> bool scx_rcu_cpu_stall(void);
>
> #else /* !CONFIG_SCHED_CLASS_EXT */
> @@ -238,7 +238,7 @@ bool scx_rcu_cpu_stall(void);
> static inline void sched_ext_dead(struct task_struct *p) {}
> static inline void print_scx_info(const char *log_lvl, struct task_struct *p) {}
> static inline void scx_softlockup(u32 dur_s) {}
> -static inline bool scx_hardlockup(void) { return false; }
> +static inline bool scx_hardlockup(int cpu) { return false; }
> static inline bool scx_rcu_cpu_stall(void) { return false; }
>
> #endif /* CONFIG_SCHED_CLASS_EXT */
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 8a3b8f64a06b..918573f3f088 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -3687,6 +3687,17 @@ bool scx_allow_ttwu_queue(const struct task_struct *p)
> return false;
> }
>
> +/**
> + * handle_lockup - sched_ext common lockup handler
> + * @fmt: format string
> + *
> + * Called on system stall or lockup condition and initiates abort of sched_ext
> + * if enabled, which may resolve the reported lockup.
> + *
> + * Returns %true if sched_ext is enabled and abort was initiated, which may
> + * resolve the lockup. %false if sched_ext is not enabled or abort was already
> + * initiated by someone else.
> + */
> static __printf(1, 2) bool handle_lockup(const char *fmt, ...)
> {
> struct scx_sched *sch;
> @@ -3718,6 +3729,10 @@ static __printf(1, 2) bool handle_lockup(const char *fmt, ...)
> * that may not be caused by the current BPF scheduler, try kicking out the
> * current scheduler in an attempt to recover the system to a good state before
> * issuing panics.
> + *
> + * Returns %true if sched_ext is enabled and abort was initiated, which may
> + * resolve the reported RCU stall. %false if sched_ext is not enabled or someone
> + * else already initiated abort.
> */
> bool scx_rcu_cpu_stall(void)
> {
> @@ -3750,14 +3765,18 @@ void scx_softlockup(u32 dur_s)
> * numerous affinitized tasks in a single queue and directing all CPUs at it.
> * Try kicking out the current scheduler in an attempt to recover the system to
> * a good state before taking more drastic actions.
> + *
> + * Returns %true if sched_ext is enabled and abort was initiated, which may
> + * resolve the reported hardlockup. %false if sched_ext is not enabled or
> + * someone else already initiated abort.
> */
> -bool scx_hardlockup(void)
> +bool scx_hardlockup(int cpu)
> {
> - if (!handle_lockup("hard lockup - CPU %d", smp_processor_id()))
> + if (!handle_lockup("hard lockup - CPU %d", cpu))
> return false;
>
> printk_deferred(KERN_ERR "sched_ext: Hard lockup - CPU %d, disabling BPF scheduler\n",
> - smp_processor_id());
> + cpu);
> return true;
> }
>
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 8dfac4a8f587..873020a2a581 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -203,7 +203,7 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
> * only once when sched_ext is enabled and will immediately
> * abort the BPF scheduler and print out a warning message.
> */
> - if (scx_hardlockup())
> + if (scx_hardlockup(cpu))
> return;
>
> /* Only print hardlockups once. */
>