Message-ID: <aAQVRhs_y-dxC4yE@gpd3>
Date: Sat, 19 Apr 2025 23:27:34 +0200
From: Andrea Righi <arighi@...dia.com>
To: Tejun Heo <tj@...nel.org>
Cc: David Vernet <void@...ifault.com>, Changwoo Min <changwoo@...lia.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] sched_ext: Track currently locked rq
On Sat, Apr 19, 2025 at 10:30:37PM +0200, Andrea Righi wrote:
> On Sat, Apr 19, 2025 at 10:10:13PM +0200, Andrea Righi wrote:
> > On Sat, Apr 19, 2025 at 07:34:16AM -1000, Tejun Heo wrote:
> > > Hello, Andrea.
> > >
> > > On Sat, Apr 19, 2025 at 02:24:30PM +0200, Andrea Righi wrote:
> > > > @@ -149,6 +149,7 @@ struct sched_ext_entity {
> > > > s32 selected_cpu;
> > > > u32 kf_mask; /* see scx_kf_mask above */
> > > > struct task_struct *kf_tasks[2]; /* see SCX_CALL_OP_TASK() */
> > > > + struct rq *locked_rq; /* currently locked rq */
> > >
> > > Can this be a percpu variable? While rq is locked, current can't switch out
> > > anyway and that way we don't have to increase the size of task. Note that
> > > kf_tasks[] are different in that some ops may, at least theoretically,
> > > sleep.
> >
> > Yeah, I was debating between using a percpu variable and storing it in
> > current. I went with current just to stay consistent with kf_tasks.
> >
> > But you're right about not increasing the size of the task, and as you
> > pointed out, we can’t switch if the rq is locked, so a percpu variable
> > should work. I’ll update that in v2.
>
> Hm... actually, thinking more about this, a problem with the percpu variable
> is that, if no rq is locked, we could move to a different CPU and end up
> reading the wrong locked_rq via scx_locked_rq(). I don't think we want to
> wrap all the callbacks in preempt_disable()/preempt_enable() just to fix
> this... Maybe storing in current is a safer choice?
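To make that concrete, the percpu approach would be roughly something like
this (just a sketch, the variable and helper names here are made up):

static DEFINE_PER_CPU(struct rq *, scx_locked_rq_state);

static inline void update_locked_rq(struct rq *rq)
{
        /* Safe: the rq lock is held, so current can't migrate here. */
        this_cpu_write(scx_locked_rq_state, rq);
}

static inline struct rq *scx_locked_rq(void)
{
        /*
         * Not reliable when no rq is locked: without preempt_disable()
         * we may have moved to another CPU since the last update and
         * read a stale pointer.
         */
        return this_cpu_read(scx_locked_rq_state);
}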
And if we don't want to increase the size of sched_ext_entity, we could
store the CPU of the currently locked rq right before "disallow", like:
struct sched_ext_entity {
        struct scx_dispatch_q *    dsq;                   /*     0     8 */
        struct scx_dsq_list_node   dsq_list;              /*     8    24 */
        struct rb_node             dsq_priq __attribute__((__aligned__(8))); /*    32    24 */
        u32                        dsq_seq;               /*    56     4 */
        u32                        dsq_flags;             /*    60     4 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u32                        flags;                 /*    64     4 */
        u32                        weight;                /*    68     4 */
        s32                        sticky_cpu;            /*    72     4 */
        s32                        holding_cpu;           /*    76     4 */
        s32                        selected_cpu;          /*    80     4 */
        u32                        kf_mask;               /*    84     4 */
        struct task_struct *       kf_tasks[2];           /*    88    16 */
        atomic_long_t              ops_state;             /*   104     8 */
        struct list_head           runnable_node;         /*   112    16 */
        /* --- cacheline 2 boundary (128 bytes) --- */
        long unsigned int          runnable_at;           /*   128     8 */
        u64                        core_sched_at;         /*   136     8 */
        u64                        ddsp_dsq_id;           /*   144     8 */
        u64                        ddsp_enq_flags;        /*   152     8 */
        u64                        slice;                 /*   160     8 */
        u64                        dsq_vtime;             /*   168     8 */
        int                        locked_cpu;            /*   176     4 */
        bool                       disallow;              /*   180     1 */

        /* XXX 3 bytes hole, try to pack */

        struct cgroup *            cgrp_moving_from;      /*   184     8 */
        /* --- cacheline 3 boundary (192 bytes) --- */
        struct list_head           tasks_node;            /*   192    16 */

        /* size: 208, cachelines: 4, members: 24 */
        /* sum members: 205, holes: 1, sum holes: 3 */
        /* forced alignments: 1 */
        /* last cacheline: 16 bytes */
} __attribute__((__aligned__(8)));
(before adding locked_cpu, the hole was 7 bytes)
Then use cpu_rq()/cpu_of() to resolve that to/from the corresponding rq.
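Something along these lines (just a sketch; the helper names and using -1
to mean "no rq locked" are only my assumptions here, not what's in the patch):

static inline void scx_set_locked_cpu(struct task_struct *p, struct rq *rq)
{
        /* -1 = no rq currently locked (assumed convention) */
        p->scx.locked_cpu = rq ? cpu_of(rq) : -1;
}

static inline struct rq *scx_locked_rq(struct task_struct *p)
{
        return p->scx.locked_cpu >= 0 ? cpu_rq(p->scx.locked_cpu) : NULL;
}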
-Andrea