Message-ID: <8d146099-a12a-c5a1-4829-dec95497fdca@google.com>
Date: Fri, 2 Dec 2022 12:08:27 -0500
From: Barret Rhoden <brho@...gle.com>
To: Tejun Heo <tj@...nel.org>
Cc: torvalds@...ux-foundation.org, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
bristot@...hat.com, vschneid@...hat.com, ast@...nel.org,
daniel@...earbox.net, andrii@...nel.org, martin.lau@...nel.org,
joshdon@...gle.com, pjt@...gle.com, derkling@...gle.com,
haoluo@...gle.com, dvernet@...a.com, dschatzberg@...a.com,
dskarlat@...cmu.edu, riel@...riel.com,
linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
kernel-team@...a.com
Subject: Re: [PATCH 14/31] sched_ext: Implement BPF extensible scheduler class
hi -
On 11/30/22 03:22, Tejun Heo wrote:
[...]
> +static bool consume_dispatch_q(struct rq *rq, struct rq_flags *rf,
> + struct scx_dispatch_q *dsq)
> +{
> + struct scx_rq *scx_rq = &rq->scx;
> + struct task_struct *p;
> + struct rq *task_rq;
> + bool moved = false;
> +retry:
> + if (list_empty(&dsq->fifo))
> + return false;
> +
> + raw_spin_lock(&dsq->lock);
> + list_for_each_entry(p, &dsq->fifo, scx.dsq_node) {
> + task_rq = task_rq(p);
> + if (rq == task_rq)
> + goto this_rq;
> + if (likely(rq->online) && !is_migration_disabled(p) &&
> + cpumask_test_cpu(cpu_of(rq), p->cpus_ptr))
> + goto remote_rq;
> + }
> + raw_spin_unlock(&dsq->lock);
> + return false;
> +
> +this_rq:
> + /* @dsq is locked and @p is on this rq */
> + WARN_ON_ONCE(p->scx.holding_cpu >= 0);
> + list_move_tail(&p->scx.dsq_node, &scx_rq->local_dsq.fifo);
> + dsq->nr--;
> + scx_rq->local_dsq.nr++;
> + p->scx.dsq = &scx_rq->local_dsq;
> + raw_spin_unlock(&dsq->lock);
> + return true;
> +
> +remote_rq:
> +#ifdef CONFIG_SMP
> + /*
> + * @dsq is locked and @p is on a remote rq. @p is currently protected by
> + * @dsq->lock. We want to pull @p to @rq but may deadlock if we grab
> + * @task_rq while holding @dsq and @rq locks. As dequeue can't drop the
> + * rq lock or fail, do a little dancing from our side. See
> + * move_task_to_local_dsq().
> + */
> + WARN_ON_ONCE(p->scx.holding_cpu >= 0);
> + list_del_init(&p->scx.dsq_node);
> + dsq->nr--;
> + p->scx.holding_cpu = raw_smp_processor_id();
> + raw_spin_unlock(&dsq->lock);
> +
> + rq_unpin_lock(rq, rf);
> + double_lock_balance(rq, task_rq);
> + rq_repin_lock(rq, rf);
> +
> + moved = move_task_to_local_dsq(rq, p);
you might be able to avoid the double_lock_balance() by using
move_queued_task(), which internally hands off the old rq lock and
returns with the new rq lock.
the pattern for consume_dispatch_q() would be something like:
- kfunc from bpf, with this_rq lock held
- notice p isn't on this_rq, goto remote_rq:
- do sched_ext accounting, like this_rq->dsq->nr--
- unlock this_rq
- p_rq = task_rq_lock(p)
- double_check p->rq didn't change to this_rq during that unlock
- new_rq = move_queued_task(p_rq, rf, p, new_cpu)
- do sched_ext accounting like new_rq->dsq->nr++
- unlock new_rq
- relock the original this_rq
- return to bpf
you still end up grabbing both locks, but just not at the same time.
plus, task_rq_lock() takes the guesswork out of whether you're getting
p's rq lock or not. it looks like you're using the holding_cpu to
handle the race where p moves cpus after you read task_rq(p) but before
you lock that task_rq. maybe you can drop the whole concept of the
holding_cpu?
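to make the lock ordering in the steps above concrete, here is a small single-threaded userspace model of the hand-off. it is only a sketch: every name ending in _model, plus lock_rq(), unlock_rq(), locks_held, max_locks_held and demo(), is hypothetical and stands in for the kernel's rq lock, task_rq_lock() and move_queued_task(). it is not kernel code and does not model real concurrency. it only tracks how many rq locks are held at once, to show that the pull can finish without ever holding two.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these are the kernel's types or helpers. */
struct rq_model {
	bool locked;	/* stands in for the rq spinlock */
	int nr;		/* number of tasks queued on this rq */
};

struct task_model {
	struct rq_model *rq;	/* stands in for task_rq(p) */
};

static int locks_held;		/* model locks held right now */
static int max_locks_held;	/* high-water mark of locks_held */

static void lock_rq(struct rq_model *rq)
{
	assert(!rq->locked);
	rq->locked = true;
	if (++locks_held > max_locks_held)
		max_locks_held = locks_held;
}

static void unlock_rq(struct rq_model *rq)
{
	assert(rq->locked);
	rq->locked = false;
	locks_held--;
}

/* Models task_rq_lock(): lock p's rq, retrying if p moved meanwhile. */
static struct rq_model *task_rq_lock_model(struct task_model *p)
{
	for (;;) {
		struct rq_model *rq = p->rq;

		lock_rq(rq);
		if (rq == p->rq)
			return rq;
		unlock_rq(rq);
	}
}

/* Models move_queued_task(): entered with src locked, returns with dst locked. */
static struct rq_model *move_queued_task_model(struct rq_model *src,
					       struct task_model *p,
					       struct rq_model *dst)
{
	src->nr--;		/* dequeue from the old rq */
	p->rq = dst;
	unlock_rq(src);		/* hand off: drop the old lock first ... */
	lock_rq(dst);		/* ... then take the new one */
	dst->nr++;		/* enqueue on the new rq */
	return dst;
}

/* Entered and left with this_rq locked; returns true once p is on this_rq. */
static bool pull_task_model(struct rq_model *this_rq, struct task_model *p)
{
	struct rq_model *p_rq, *new_rq;

	unlock_rq(this_rq);
	p_rq = task_rq_lock_model(p);
	if (p_rq == this_rq)
		return true;	/* p moved here during the unlock window */

	new_rq = move_queued_task_model(p_rq, p, this_rq);
	assert(new_rq == this_rq);	/* destination is this_rq, already relocked */
	return true;
}

/* Usage example: pull one task from a remote rq; returns 0 on success. */
static int demo(void)
{
	struct rq_model this_rq = { false, 0 }, remote = { false, 1 };
	struct task_model p = { &remote };
	bool moved;

	lock_rq(&this_rq);
	moved = pull_task_model(&this_rq, &p);
	unlock_rq(&this_rq);

	return (moved && p.rq == &this_rq && this_rq.nr == 1 &&
		remote.nr == 0 && max_locks_held == 1) ? 0 : 1;
}
```

in this model the destination is always this_rq, so the "unlock new_rq / relock this_rq" pair collapses into simply keeping the lock that the move returned with. a real implementation would also have to redo whatever checks depended on this_rq staying locked across the window.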
thanks,
barret
> +
> + double_unlock_balance(rq, task_rq);
> +#endif /* CONFIG_SMP */
> + if (likely(moved))
> + return true;
> + goto retry;
> +}