[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aAYDTp-c3SV_Lnsz@gpd3>
Date: Mon, 21 Apr 2025 10:35:26 +0200
From: Andrea Righi <arighi@...dia.com>
To: Changwoo Min <changwoo@...lia.com>
Cc: Tejun Heo <tj@...nel.org>, David Vernet <void@...ifault.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] sched_ext: Fix missing rq lock in
scx_bpf_cpuperf_set()
Hi Changwoo,
On Mon, Apr 21, 2025 at 04:44:02PM +0900, Changwoo Min wrote:
> Hi Andrea,
>
> The patchset looks good to me. Thanks!
> I have one comment, that don't need to be addressed in this patch set,
> below:
>
> Acked-by: Changwoo Min <changwoo@...lia.com>
>
> On 4/21/25 04:30, Andrea Righi wrote:
> > scx_bpf_cpuperf_set() can be used to set a performance target level on
> > any CPU. However, it doesn't correctly acquire the corresponding rq
> > lock, which may lead to unsafe behavior and trigger the following
> > warning, due to the lockdep_assert_rq_held() check:
> >
> > [ 51.713737] WARNING: CPU: 3 PID: 3899 at kernel/sched/sched.h:1512 scx_bpf_cpuperf_set+0x1a0/0x1e0
> > ...
> > [ 51.713836] Call trace:
> > [ 51.713837] scx_bpf_cpuperf_set+0x1a0/0x1e0 (P)
> > [ 51.713839] bpf_prog_62d35beb9301601f_bpfland_init+0x168/0x440
> > [ 51.713841] bpf__sched_ext_ops_init+0x54/0x8c
> > [ 51.713843] scx_ops_enable.constprop.0+0x2c0/0x10f0
> > [ 51.713845] bpf_scx_reg+0x18/0x30
> > [ 51.713847] bpf_struct_ops_link_create+0x154/0x1b0
> > [ 51.713849] __sys_bpf+0x1934/0x22a0
> >
> > Fix by properly acquiring the rq lock when possible or raising an error
> > if we try to operate on a CPU that is not the one currently locked.
> >
> > Fixes: d86adb4fc0655 ("sched_ext: Add cpuperf support")
> > Signed-off-by: Andrea Righi <arighi@...dia.com>
> > ---
> > kernel/sched/ext.c | 27 +++++++++++++++++++++++----
> > 1 file changed, 23 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> > index 51dad94f1952d..6b6681b14488e 100644
> > --- a/kernel/sched/ext.c
> > +++ b/kernel/sched/ext.c
> > @@ -7088,13 +7088,32 @@ __bpf_kfunc void scx_bpf_cpuperf_set(s32 cpu, u32 perf)
> > }
> > if (ops_cpu_valid(cpu, NULL)) {
> > - struct rq *rq = cpu_rq(cpu);
> > + struct rq *rq = cpu_rq(cpu), *locked_rq = scx_locked_rq();
> > + struct rq_flags rf;
> > +
> > + /*
> > + * When called with an rq lock held, restrict the operation
> > + * to the corresponding CPU to prevent ABBA deadlocks.
> > + */
> > + if (locked_rq && rq != locked_rq) {
> > + scx_error("Invalid target CPU %d", cpu);
> > + return;
> > + }
>
>
> Considering almost all chext_ext ops hold an rq lock, the actual ops
> where scx_bpf_cpuperf_set() for a remote CPU is possible will be very
> limited. When there are more use cases for remote CPU kfuncs calls, I
> think we need to come up with some mechanism, for example, extending
> schedule_deferred() to cover more actions.
I agree, in this particular case, I think it's perfectly reasonable to
simply ops error when targeting a remote CPU, I don’t see any valid use
case for changing the cpufreq target of a remote CPU while we're holding a
rq lock.
We do have some users of scx_bpf_cpuperf_set() that invoke it without
holding any rq lock, for example, scx_rusty/wd40 and scx_bpfland can
set the cpuperf target from ops.init(), but this scenario is already
covered by this patchset.
In general, we may have APIs in the future that may need to modify
properties of a remote rq/cpu. In such cases, extending schedule_deferred()
sounds like the right path forward to me as well.
Thanks,
-Andrea
Powered by blists - more mailing lists