Message-ID: <aJtCSjsCEtN1csjg@gpd4>
Date: Tue, 12 Aug 2025 15:31:54 +0200
From: Andrea Righi <arighi@...dia.com>
To: Jake Hillion <jake@...lion.co.uk>
Cc: Tejun Heo <tj@...nel.org>, Christian Loehle <christian.loehle@....com>,
void@...ifault.com, linux-kernel@...r.kernel.org,
sched-ext@...ts.linux.dev, changwoo@...lia.com, hodgesd@...a.com,
mingo@...hat.com, peterz@...radead.org
Subject: Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
On Mon, Aug 11, 2025 at 03:35:05PM +0100, Jake Hillion wrote:
> On Sun, Aug 10, 2025 at 12:52:53PM +0200, Andrea Righi wrote:
> > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
> > we force schedulers to check for NULL and, if they don't, the verifier
> > won't be happy, so this already breaks existing binaries.
>
> I ran some testing on the sched_ext for-next branch, and scx_cosmos is
> breaking in cosmos_init including the latest changes. I believe it kicks
> off a timer in init, which indirectly calls
> `scx_bpf_cpu_rq(cpu)->curr->flags & PF_IDLE`. This should be NULL
> checked, but old binaries breaking is pretty inconvenient for new users.
>
> As Andrea says, this is the already merged patch triggering this.
We should provide a compat helper in common.bpf.h and fix the schedulers to
use this helper. Something like the following (untested):
static inline struct task_struct *
__COMPAT_scx_bpf_task_acquire_remote_curr(s32 cpu)
{
	struct rq *rq;

	/* Use the new kfunc when the running kernel provides it. */
	if (bpf_ksym_exists(scx_bpf_task_acquire_remote_curr))
		return scx_bpf_task_acquire_remote_curr(cpu);

	/* Older kernels: fall back to the rq, which may be NULL. */
	rq = scx_bpf_cpu_rq(cpu);

	return rq ? rq->curr : NULL;
}
Then we can drop this after a couple of kernel releases (like in v6.20).
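For illustration only, a caller doing the PF_IDLE check mentioned above
could then look roughly like the sketch below. This is a plain userspace
mock-up, not BPF code: mock_remote_curr(), mock_curr[] and cpu_is_idle()
are hypothetical names standing in for the compat helper and a scheduler
function, struct task_struct is reduced to the one field used here, and
reference counting of the returned task is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

#define PF_IDLE		0x00000002	/* same value as the kernel's PF_IDLE */
#define NR_MOCK_CPUS	2		/* hypothetical, for the mock only */

/* Reduced stand-in for the kernel's struct task_struct. */
struct task_struct {
	unsigned int flags;
};

static struct task_struct idle_task = { .flags = PF_IDLE };
static struct task_struct busy_task = { .flags = 0 };

/* Mock per-CPU "current task" table: CPU 0 is idle, CPU 1 is busy. */
static struct task_struct *mock_curr[NR_MOCK_CPUS] = { &idle_task, &busy_task };

/*
 * Hypothetical stand-in for the compat helper: returns the task running
 * on @cpu, or NULL when it cannot be obtained.
 */
static struct task_struct *mock_remote_curr(int cpu)
{
	if (cpu < 0 || cpu >= NR_MOCK_CPUS)
		return NULL;

	return mock_curr[cpu];
}

/* Caller pattern: check for NULL before touching ->flags. */
static bool cpu_is_idle(int cpu)
{
	struct task_struct *p = mock_remote_curr(cpu);

	if (!p)
		return false;

	return (p->flags & PF_IDLE) != 0;
}
```

The point is only the NULL check ahead of the dereference. What to
return when no task can be obtained (false here) is a choice each
scheduler would have to make for itself.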
-Andrea