linux-kernel - Re: [PATCH] sched/core: Minor optimize pick_next


Message-ID: <CA+HDTgT0sjt38E4-2uQs_2t1GSsYFDqz3porOx-WQbt8x9hhXw@mail.gmail.com>
Date:   Wed, 22 Mar 2023 16:35:20 -0400
From:   Vineeth Pillai <vineethrp@...gle.com>
To:     Joel Fernandes <joel@...lfernandes.org>
Cc:     Hao Jia <jiahao.os@...edance.com>, mingo@...hat.com,
        peterz@...radead.org, mingo@...nel.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com,
        mgorman@...hsingularity.net, linux-kernel@...r.kernel.org,
        Josh Don <joshdon@...gle.com>
Subject: Re: [PATCH] sched/core: Minor optimize pick_next_task() when
 core-sched enable

Merging two threads.

On Tue, Mar 21, 2023 at 5:40 PM Joel Fernandes <joel@...lfernandes.org> wrote:
> >
> > CPU0 and CPU1 are two CPUs on the same core, task0 and task2 are the
> > highest priority tasks on rq0 and rq1 respectively, task2 is @max
> > on the entire core.

> I'm assuming this all starts by rq0 doing a pick and getting task0.
> Because any other selection would go down the whole !need_sync route.
>
I think this could also happen when rq1 starts the pick because task2 woke up
while task0 was running on rq0. In that case core->core_cookie would be set
and we would take the need_sync path, I guess.

> > In the case that 'max' has a zero cookie, instead of continuing to
> > search for a runnable task on rq0 that matches @max's cookie, we
> > choose idle for rq0 directly.
> > At this time, it is obviously better to choose task1 to run for rq0,
> > which will increase the CPU utilization.
> > Therefore, we queue tasks with zero cookies in core_tree, and record
> > the number of non-zero cookie tasks of each rq to detect the status
> > of the sched-core.
>
> I do remember this as a known issue (more of a known but unimplemented
> optimization) which happens when you have a high priority non-cookie
> task which is in front of several low priority ones on the same
> thread/rq. Adding +Vineeth Pillai to see if he remembers the issue.
>
Yes, I remember this as one of the two issues we noticed but did not get around
to fixing. Non-cookied tasks are treated specially here as a side effect of the
implementation (they are not in the core rbtree), and hence we force idle if max
is non-cookied and the highest prio task on the sibling is cookied.
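
To make the forced-idle case concrete, here is a toy user-space model of the
sibling pick. It is only a sketch and not kernel code: toy_task, toy_rq,
toy_core_find(), toy_pick_sibling() and the zero_in_tree switch are all
hypothetical names invented for illustration, and "larger prio wins" is an
assumption of the model.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these names exist in the kernel. */
struct toy_task {
	const char *name;
	int prio;              /* larger value = higher priority in this toy */
	unsigned long cookie;  /* 0 = no core-sched cookie */
};

struct toy_rq {
	const struct toy_task *tasks;
	size_t nr;
};

/*
 * Highest-priority task on @rq whose cookie equals @cookie, or NULL.
 * With @zero_in_tree == 0, zero-cookie tasks are invisible to the lookup,
 * mimicking "non-cookied tasks are not in the core rbtree".
 */
static const struct toy_task *
toy_core_find(const struct toy_rq *rq, unsigned long cookie, int zero_in_tree)
{
	const struct toy_task *best = NULL;
	size_t i;

	if (cookie == 0 && !zero_in_tree)
		return NULL;

	for (i = 0; i < rq->nr; i++) {
		const struct toy_task *t = &rq->tasks[i];

		if (t->cookie == cookie && (!best || t->prio > best->prio))
			best = t;
	}
	return best;
}

/* Pick for a sibling rq once the core-wide max is known; NULL = forced idle. */
static const struct toy_task *
toy_pick_sibling(const struct toy_rq *rq, const struct toy_task *max,
		 int zero_in_tree)
{
	const struct toy_task *top = NULL;
	size_t i;

	assert(max != NULL);

	for (i = 0; i < rq->nr; i++)
		if (!top || rq->tasks[i].prio > top->prio)
			top = &rq->tasks[i];

	if (top && top->cookie == max->cookie)
		return top;	/* the rq's own best task is compatible */

	return toy_core_find(rq, max->cookie, zero_in_tree);
}
```

With rq0 holding a cookied task0 in front of a non-cookied task1, and a
non-cookied max on the sibling, toy_pick_sibling() returns NULL (forced idle)
when zero_in_tree is 0 and returns task1 when it is 1.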

The other issue was that we don't update the core rbtree when vruntime changes,
which can starve a cookied task if there is more than one task with the same
cookie on an rq.
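
The starvation effect of a stale ordering key can be sketched with a small
user-space model. This is an illustration under assumptions, not the kernel's
data structure: toy_se, tree_key, toy_pick_first() and toy_run() are
hypothetical, a linear scan stands in for the rbtree, and each run adds a fixed
10 units of vruntime.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of two same-cookie tasks ordered by a cached key. */
struct toy_se {
	unsigned long long vruntime;  /* live value, advances as the task runs */
	unsigned long long tree_key;  /* ordering key captured at enqueue time */
	unsigned int runs;
};

/* "Leftmost" entry: the smallest cached key, lowest index on ties. */
static struct toy_se *toy_pick_first(struct toy_se *se, size_t nr)
{
	struct toy_se *first;
	size_t i;

	assert(nr > 0);
	first = &se[0];
	for (i = 1; i < nr; i++)
		if (se[i].tree_key < first->tree_key)
			first = &se[i];
	return first;
}

/*
 * Run @rounds picks. If @requeue is 0 the cached key is never refreshed,
 * so the same entry keeps winning even though its vruntime has grown.
 */
static void toy_run(struct toy_se *se, size_t nr, unsigned int rounds,
		    int requeue)
{
	while (rounds--) {
		struct toy_se *p = toy_pick_first(se, nr);

		p->vruntime += 10;
		p->runs++;
		if (requeue)
			p->tree_key = p->vruntime;
	}
}
```

Starting from keys 0 and 5, four rounds without requeue run the first task four
times and the second never; with requeue the two tasks alternate.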

> >  static inline void dequeue_task(struct rq *rq, struct task_struct *p, int flags)
> >  {
> > -	if (sched_core_enabled(rq))
> > -		sched_core_dequeue(rq, p, flags);
> > +	sched_core_dequeue(rq, p, flags);
> >
> >  	if (!(flags & DEQUEUE_NOCLOCK))
> >  		update_rq_clock(rq);

> Yeah, this is an absolute no-no, it makes the overhead of the second rb
> tree unconditional.

I agree. Could we keep it conditional by enqueuing 0-cookied tasks only when
coresched is enabled, just as we do for cookied tasks? That is still an
overhead, since we would have two trees storing all the runnable tasks in
different orders. We would also need to populate the core rbtree from the cfs
rbtree on coresched enable and empty it on coresched disable.
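
A rough user-space sketch of that bookkeeping follows. All names here
(toy_core_rq, toy_enqueue(), toy_core_enable() and so on) are hypothetical, and
flag arrays stand in for the cfs and core rbtrees. It only shows the idea that
the second structure is touched solely while coresched is enabled, and is
filled or emptied on the enable/disable transitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_TASKS 8

/* Hypothetical rq: flag arrays stand in for the cfs and core rbtrees. */
struct toy_core_rq {
	int core_enabled;
	int runnable[TOY_MAX_TASKS];      /* always maintained */
	int in_core_tree[TOY_MAX_TASKS];  /* maintained only while enabled */
};

static void toy_enqueue(struct toy_core_rq *rq, size_t id)
{
	assert(id < TOY_MAX_TASKS);
	rq->runnable[id] = 1;
	if (rq->core_enabled)		/* no second-tree cost when disabled */
		rq->in_core_tree[id] = 1;
}

static void toy_dequeue(struct toy_core_rq *rq, size_t id)
{
	assert(id < TOY_MAX_TASKS);
	rq->runnable[id] = 0;
	if (rq->core_enabled)
		rq->in_core_tree[id] = 0;
}

/* On enable: populate the core structure from everything runnable. */
static void toy_core_enable(struct toy_core_rq *rq)
{
	memcpy(rq->in_core_tree, rq->runnable, sizeof(rq->in_core_tree));
	rq->core_enabled = 1;
}

/* On disable: empty the core structure again. */
static void toy_core_disable(struct toy_core_rq *rq)
{
	rq->core_enabled = 0;
	memset(rq->in_core_tree, 0, sizeof(rq->in_core_tree));
}

static size_t toy_core_tree_count(const struct toy_core_rq *rq)
{
	size_t i, n = 0;

	for (i = 0; i < TOY_MAX_TASKS; i++)
		n += rq->in_core_tree[i] != 0;
	return n;
}
```

Tasks enqueued while disabled do not appear in the core structure until
toy_core_enable() copies them over, and toy_core_disable() drops them from it
without touching the runnable set.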

Thanks,
Vineeth