Message-ID: <7ed142f6-ec00-4cd2-ac26-eff198d6a4d4@arm.com>
Date: Tue, 20 Aug 2024 17:23:03 +0100
From: Hongyan Xia <hongyan.xia2@....com>
To: Peter Zijlstra <peterz@...radead.org>, mingo@...hat.com,
juri.lelli@...hat.com, vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
vschneid@...hat.com, linux-kernel@...r.kernel.org
Cc: kprateek.nayak@....com, wuyun.abel@...edance.com,
youssefesmat@...omium.org, tglx@...utronix.de, efault@....de,
Luis Machado <luis.machado@....com>
Subject: Re: [PATCH 10/24] sched/uclamg: Handle delayed dequeue
On 27/07/2024 11:27, Peter Zijlstra wrote:
> Delayed dequeue has tasks sit around on the runqueue that are not
> actually runnable -- specifically, they will be dequeued the moment
> they get picked.
>
> One side-effect is that such a task can get migrated, which leads to a
> 'nested' dequeue_task() scenario that messes up uclamp if we don't
> take care.
>
> Notably, dequeue_task(DEQUEUE_SLEEP) can 'fail' and keep the task on
> the runqueue. This however will have removed the task from uclamp --
> per uclamp_rq_dec() in dequeue_task(). So far so good.
>
> However, if at that point the task gets migrated -- or nice adjusted
> or any of a myriad of operations that does a dequeue-enqueue cycle --
> we'll pass through dequeue_task()/enqueue_task() again. Without
> modification this will lead to a double decrement for uclamp, which is
> wrong.
>
> Reported-by: Luis Machado <luis.machado@....com>
> Reported-by: Hongyan Xia <hongyan.xia2@....com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
> kernel/sched/core.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1676,6 +1676,9 @@ static inline void uclamp_rq_inc(struct
> if (unlikely(!p->sched_class->uclamp_enabled))
> return;
>
> + if (p->se.sched_delayed)
> + return;
> +
> for_each_clamp_id(clamp_id)
> uclamp_rq_inc_id(rq, p, clamp_id);
>
> @@ -1700,6 +1703,9 @@ static inline void uclamp_rq_dec(struct
> if (unlikely(!p->sched_class->uclamp_enabled))
> return;
>
> + if (p->se.sched_delayed)
> + return;
> +
> for_each_clamp_id(clamp_id)
> uclamp_rq_dec_id(rq, p, clamp_id);
> }
> @@ -1979,8 +1985,12 @@ void enqueue_task(struct rq *rq, struct
> psi_enqueue(p, (flags & ENQUEUE_WAKEUP) && !(flags & ENQUEUE_MIGRATED));
> }
>
> - uclamp_rq_inc(rq, p);
> p->sched_class->enqueue_task(rq, p, flags);
> + /*
> + * Must be after ->enqueue_task() because ENQUEUE_DELAYED can clear
> + * ->sched_delayed.
> + */
> + uclamp_rq_inc(rq, p);
Apart from the typo in the title, this is a notable functional change.
Both classes that support uclamp update the CPU frequency in
enqueue_task(). Before, a task with a uclamp_min would immediately
drive up the frequency the moment it was enqueued. Now, raising the
frequency is deferred until the next util update.

I do not yet have evidence that this is quantitatively bad (e.g. first
frame drops), but we might want to keep an eye on it and switch back
to the old ordering if possible.
>
> if (sched_core_enabled(rq))
> sched_core_enqueue(rq, p);
> @@ -2002,6 +2012,10 @@ inline bool dequeue_task(struct rq *rq,
> psi_dequeue(p, flags & DEQUEUE_SLEEP);
> }
>
> + /*
> + * Must be before ->dequeue_task() because ->dequeue_task() can 'fail'
> + * and mark the task ->sched_delayed.
> + */
> uclamp_rq_dec(rq, p);
> return p->sched_class->dequeue_task(rq, p, flags);
> }
>
>