linux-kernel - Re: [PATCH] sched/core: Minor optimize pick_next

Message-ID: <20230320113910.GI2194297@hirez.programming.kicks-ass.net>
Date:   Mon, 20 Mar 2023 12:39:10 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Hao Jia <jiahao.os@...edance.com>
Cc:     mingo@...hat.com, mingo@...nel.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com,
        mgorman@...hsingularity.net, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/core: Minor optimize pick_next_task() when
 core-sched enable

On Wed, Mar 08, 2023 at 06:04:13PM +0800, Hao Jia wrote:

> core max: task2 (cookie 0)
> 
> 	rq0				rq1
> task0(cookie non-zero)		task2(cookie 0)
> task1(cookie 0)
> task3(cookie 0)
> ...
> 
> pick-task: idle			pick-task: task2
> 
> CPU0 and CPU1 are two CPUs on the same core, task0 and task2 are the
> highest priority tasks on rq0 and rq1 respectively, task2 is @max
> on the entire core.

I'm assuming this all starts with rq0 doing a pick and getting task0,
because any other selection would go down the whole !need_sync route.
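
To make the need_sync remark concrete, here is a minimal user-space sketch
of the decision being described. It assumes that need_sync reflects whether
the core currently carries a non-zero cookie, and that a local pick without
a cookie can then be scheduled without a core-wide selection. The toy_*
names are hypothetical and are not the kernel's own identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; only the cookie matters here. */
struct toy_task {
	unsigned long core_cookie;	/* 0 means the task has no cookie */
};

/*
 * Returns true when a core-wide selection is needed, false when the
 * local pick can be scheduled directly (the "!need_sync route").
 *
 * Assumption: need_sync is derived from the cookie currently active on
 * the core, and the shortcut applies only if the local pick is also
 * uncookied.
 */
static bool toy_needs_core_wide_pick(unsigned long active_core_cookie,
				     const struct toy_task *local_pick)
{
	bool need_sync = active_core_cookie != 0;

	assert(local_pick != NULL);

	if (!need_sync && local_pick->core_cookie == 0)
		return false;	/* fast path: no sibling coordination */

	return true;		/* slow path: pick for every sibling */
}
```

In the quoted scenario, rq0 picking task0 (non-zero cookie) lands on the slow
path in this sketch, whereas picking an uncookied task while the core has no
active cookie would take the shortcut.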

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index af017e038b48..765cd14c52e1 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -236,8 +236,8 @@ void sched_core_enqueue(struct rq *rq, struct task_struct *p)
>  {
>  	rq->core->core_task_seq++;
>  
> -	if (!p->core_cookie)
> -		return;
> +	if (p->core_cookie)
> +		rq->cookied_count++;
>  
>  	rb_add(&p->core_node, &rq->core_tree, rb_sched_core_less);
>  }

> @@ -2061,14 +2066,12 @@ static inline void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
>  	uclamp_rq_inc(rq, p);
>  	p->sched_class->enqueue_task(rq, p, flags);
>  
> -	if (sched_core_enabled(rq))
> -		sched_core_enqueue(rq, p);
> +	sched_core_enqueue(rq, p);
>  }
>  
>  static inline void dequeue_task(struct rq *rq, struct task_struct *p, int flags)
>  {
> -	if (sched_core_enabled(rq))
> -		sched_core_dequeue(rq, p, flags);
> +	sched_core_dequeue(rq, p, flags);
>  
>  	if (!(flags & DEQUEUE_NOCLOCK))
>  		update_rq_clock(rq);

Yeah, this is an absolute no-no; it makes the overhead of the second rb
tree unconditional.
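
As a rough illustration of that objection, the following user-space sketch
counts how often a second tree would be touched under the two shapes visible
in the quoted hunks. It is a toy model, not kernel code: the toy_* names are
hypothetical, a plain bool stands in for sched_core_enabled(rq), and a
counter stands in for the actual rb_add() work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's real types. */
struct toy_task {
	unsigned long core_cookie;	/* 0 means the task has no cookie */
};

struct toy_rq {
	bool core_enabled;		/* stand-in for sched_core_enabled(rq) */
	unsigned long class_ops;	/* enqueues into the regular class queue */
	unsigned long core_tree_ops;	/* insertions into the second tree */
};

/* Stand-in for the rb_add() into the core tree; only counts the call. */
static void toy_core_tree_insert(struct toy_rq *rq, struct toy_task *p)
{
	assert(rq != NULL && p != NULL);
	rq->core_tree_ops++;
}

/* Pre-patch shape: second tree only when enabled and the task is cookied. */
static void toy_enqueue_gated(struct toy_rq *rq, struct toy_task *p)
{
	rq->class_ops++;
	if (rq->core_enabled && p->core_cookie)
		toy_core_tree_insert(rq, p);
}

/* Shape of the quoted hunks: every enqueue also touches the second tree. */
static void toy_enqueue_unconditional(struct toy_rq *rq, struct toy_task *p)
{
	rq->class_ops++;
	toy_core_tree_insert(rq, p);
}
```

With core scheduling off and only uncookied tasks, the gated variant never
touches the second tree, while the unconditional variant pays for one
insertion per enqueue.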