linux-kernel - Re: [PATCH v2 4/7] sched/fair: Add SHARED_RUNQ sched feature and skeleton calls

Message-ID: <20230712213430.GE12207@maniforge>
Date:   Wed, 12 Jul 2023 16:34:30 -0500
From:   David Vernet <void@...ifault.com>
To:     Abel Wu <wuyun.abel@...edance.com>
Cc:     linux-kernel@...r.kernel.org, mingo@...hat.com,
        peterz@...radead.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com, gautham.shenoy@....com,
        kprateek.nayak@....com, aaron.lu@...el.com, clm@...a.com,
        tj@...nel.org, roman.gushchin@...ux.dev, kernel-team@...a.com
Subject: Re: [PATCH v2 4/7] sched/fair: Add SHARED_RUNQ sched feature and
 skeleton calls

On Wed, Jul 12, 2023 at 04:39:03PM +0800, Abel Wu wrote:
> On 7/11/23 4:03 AM, David Vernet wrote:
> > @@ -6467,6 +6489,9 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> >   dequeue_throttle:
> >   	util_est_update(&rq->cfs, p, task_sleep);
> >   	hrtick_update(rq);
> > +
> > +	if (sched_feat(SHARED_RUNQ))
> > +		shared_runq_dequeue_task(p);
> 
> Would disabling SHARED_RUNQ cause task list nodes to be left
> in the shared stateful runqueue?

Hi Abel,

Yes, good call, there will be some stale tasks. The obvious way to get
around that would of course be to always call
shared_runq_dequeue_task(p) on the __dequeue_entity() path, but it would
be silly to tax a hot path in the scheduler in support of a feature
that's disabled by default.

At first I was thinking that the only issue would be some overhead in
clearing stale tasks once it was re-enabled, but that we'd be OK because
of this check in shared_runq_pick_next_task():

        if (task_on_rq_queued(p) && !task_on_cpu(rq, p)) {
                update_rq_clock(src_rq);
                src_rq = move_queued_task(src_rq, &src_rf, p, cpu_of(rq));
        }

So we wouldn't migrate tasks that weren't actually suitable. But that's
obviously wrong. It's not safe to keep stale tasks in that list for (at
least) two reasons.

- A task could exit (which would be easy to fix by just adding a dequeue
  call in task_dead_fair())
- We could have a double-add if a task is re-enqueued in the list after
  having been previously enqueued, but then never dequeued due to the
  timing of disabling SHARED_RUNQ.
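
To make the double-add case concrete, here is a rough userspace sketch.
It assumes the shared runqueue links tasks through an embedded list
node, modeled loosely on the kernel's struct list_head. Every name in
it (struct task, shared_enqueue(), shared_dequeue(), feature_enabled)
is hypothetical and is not code from the patch. The "is the node
already linked?" guard only shows how a stale node could be detected on
re-enqueue. It does nothing for the exit case above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly-linked list, loosely modeled on list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
	n->prev = n->next = n;
}

/* A node that points at itself is not on any list. */
static bool list_is_unlinked(const struct list_node *n)
{
	return n->next == n;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	/* Adding an already-linked node would corrupt the list. */
	assert(list_is_unlinked(n));
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_init_node(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical stand-ins for task_struct and sched_feat(SHARED_RUNQ). */
struct task {
	int pid;
	struct list_node shared_node;
};

static bool feature_enabled = true;

/* Returns false if the feature is off or the task is still linked (stale). */
static bool shared_enqueue(struct list_node *runq, struct task *p)
{
	if (!feature_enabled || !list_is_unlinked(&p->shared_node))
		return false;
	list_add_tail_node(&p->shared_node, runq);
	return true;
}

/* Mirrors the patch's shape: the dequeue is skipped when the feature is off. */
static void shared_dequeue(struct task *p)
{
	if (feature_enabled)
		list_del_init_node(&p->shared_node);
}

static size_t shared_len(const struct list_node *runq)
{
	size_t n = 0;

	for (const struct list_node *it = runq->next; it != runq; it = it->next)
		n++;
	return n;
}
```

Enqueueing a task, clearing feature_enabled, calling shared_dequeue(),
and setting feature_enabled again leaves the node linked, so a second
shared_enqueue() on that task is the double-add described above.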

Not sure what the best solution is here. We could always address this by
draining the list when the feature is disabled, but there's not yet a
mechanism to hook in a callback to be invoked when a scheduler feature
is enabled/disabled. It shouldn't be too hard to add that, assuming
other sched folks are amenable to it. It should just be a matter of
adding another __SCHED_FEAT_NR-sized table of NULL-able callbacks that
are invoked on enable / disable state change, and which can be specified
in a new macro, e.g. SCHED_FEAT_CALLBACK.
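
Roughly, the callback table could look like the userspace sketch below.
This is only an illustration of the shape, not a proposed patch: the
FEAT_* enum, feat_mask, feat_set(), FEAT_CALLBACK() and
shared_runq_toggle() are all made-up names standing in for the real
feature enum, the feature bitmask, the sched_feat write path, the
proposed macro, and a drain hook. Locking and ordering against
concurrent enqueues are not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature enum; FEAT_NR plays the role of __SCHED_FEAT_NR. */
enum {
	FEAT_SHARED_RUNQ,
	FEAT_OTHER,
	FEAT_NR,
};

typedef void (*feat_cb_t)(bool enabled);

static unsigned long feat_mask;	/* one bit per feature */
static int drained;		/* counts drain invocations, for the demo */

/* Hypothetical hook: a real one would drain the shared runqueue here. */
static void shared_runq_toggle(bool enabled)
{
	if (!enabled)
		drained++;
}

/* Stand-in for the proposed SCHED_FEAT_CALLBACK macro. */
#define FEAT_CALLBACK(name, fn) [FEAT_##name] = (fn)

/* Features without an entry keep a NULL callback. */
static const feat_cb_t feat_callbacks[FEAT_NR] = {
	FEAT_CALLBACK(SHARED_RUNQ, shared_runq_toggle),
};

/* Flip a feature bit and invoke its callback only on a state change. */
static void feat_set(int feat, bool enable)
{
	bool was_enabled = feat_mask & (1UL << feat);

	if (was_enabled == enable)
		return;

	if (enable)
		feat_mask |= 1UL << feat;
	else
		feat_mask &= ~(1UL << feat);

	if (feat_callbacks[feat])
		feat_callbacks[feat](enable);
}
```

With this shape, disabling the feature a second time is a no-op, and
features that never register a callback pay only a NULL check on the
(rare) toggle path rather than anything on the enqueue/dequeue path.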

Peter -- what do you think?

Thanks,
David