lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0e569d64-ce35-2176-5d41-faa6997480ef@bytedance.com>
Date:   Thu, 23 Mar 2023 15:01:59 +0800
From:   Hao Jia <jiahao.os@...edance.com>
To:     Vineeth Pillai <vineethrp@...gle.com>,
        Joel Fernandes <joel@...lfernandes.org>
Cc:     mingo@...hat.com, peterz@...radead.org, mingo@...nel.org,
        juri.lelli@...hat.com, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com,
        mgorman@...hsingularity.net, linux-kernel@...r.kernel.org,
        Josh Don <joshdon@...gle.com>
Subject: Re: [External] Re: [PATCH] sched/core: Minor optimize
 pick_next_task() when core-sched enable



On 2023/3/23 Vineeth Pillai wrote:
> Merging two threads.
> 
> On Tue, Mar 21, 2023 at 5:40 PM Joel Fernandes <joel@...lfernandes.org> wrote:
>>>
>>> CPU0 and CPU1 are two CPUs on the same core, task0 and task2 are the
>>> highest priority tasks on rq0 and rq1 respectively, task2 is @max
>>> on the entire core.
> 
>> I'm assuming this all starts by rq0 doing a pick and getting task0.
>> Because any other selection would go down the whole !need_sync route.
>>
> I think this could also happen when rq1 starts the pick due to task2 wakeup
> while task0 was running in rq0. In this case, core->core_cookie would be set
> and we take the need_sync path I guess.
> 
>>> In the case that 'max' has a zero cookie, instead of continuing to
>>> search for a runnable task on rq0 that matches @max's cookie, we
>>> choose idle for rq0 directly.
>>> At this time, it is obviously better to choose task1 to run for rq0,
>>> which will increase the CPU utilization.
>>> Therefore, we queue tasks with zero cookies in core_tree, and record
>>> the number of non-zero cookie tasks of each rq to detect the status
>>> of the sched-core.
>>
>> I do remember this as a known issue (more of a known but unimplemented
>> optimization) which happens when you have a high priority non-cookie
>> task which is in front of several low priority ones on the same
>> thread/rq. Adding +Vineeth Pillai to see if he remembers the issue.
>>
> Yes, I remember this as one of the 2 issues we noticed, but could not get to
> fixing it. Here we have non-cookied tasks considered special as a side effect
> of implementation(non-cookied tasks not in core rbtree) and hence we force idle
> if max is non-cookied and the highest prio task on the sibling is cookied.
> 
> The other issue was - we don't update core rbtree when vruntime changes and
> this can cause starvation of cookied task if there are more than one task with
> the same cookie on an rq.
> 

If I understand correctly, when a cookied task is enqueued, the 
difference delta1 between its vruntime and min_vruntime is very large.

Another task with the same cookie is very actively dequeuing and 
enqueuing, and the difference delta2 between its vruntime and 
min_vruntime is always smaller than delta1?
I'm not sure if this is the case?

>>>   static inline void dequeue_task(struct rq *rq, struct task_struct *p, int flags)
>>>   {
>>> - if (sched_core_enabled(rq))
>>> - sched_core_dequeue(rq, p, flags);
>>> + sched_core_dequeue(rq, p, flags);
>>>
>>>   if (!(flags & DEQUEUE_NOCLOCK))
>>>   update_rq_clock(rq);
> 
>> Yeah, this is an absolute no-no, it makes the overhead of the second rb
>> tree unconditional.
> 
> I agree. Could we keep it conditional by enqueuing 0-cookied tasks only when
> coresched is enabled, just like what we do for cookied tasks? This is still an
> overhead where we have two trees storing all the runnable tasks but in
> different order. We would also need to populate core rbtree from cfs rbtree
> on coresched enable and empty the tree on coresched disable.
> 

I'm not sure if the other way is reasonable, I'm trying to provide a 
function for each scheduling class to find a highest priority non-cookie 
task.

For example fair_sched_class, we can use rq->cfs_tasks to traverse the 
search. But this search may take a long time, maybe we need to limit the 
number of searches.

Thanks,
Hao

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ