Message-ID: <eab45ccf-c9f5-4dff-bc36-40133783d369@kylinos.cn>
Date: Thu, 3 Jul 2025 18:14:12 +0800
From: Zihuan Zhang <zhangzihuan@...inos.cn>
To: Xuewen Yan <xuewen.yan94@...il.com>
Cc: xuewen.yan@...soc.com, vincent.guittot@...aro.org, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, rostedt@...dmis.org,
bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com,
hongyan.xia2@....com, linux-kernel@...r.kernel.org, ke.wang@...soc.com,
di.shen@...soc.com, kprateek.nayak@....com, kuyo.chang@...iatek.com,
juju.sung@...iatek.com, qyousef@...alina.io
Subject: Re: [PATCH v1] sched/uclamp: Exclude kernel threads from uclamp logic

Hi Xuewen,

Thanks for your feedback; that makes a lot of sense.

On 2025/7/3 17:42, Xuewen Yan wrote:
> Hi zihuan,
>
> On Thu, Jul 3, 2025 at 5:15 PM Zihuan Zhang <zhangzihuan@...inos.cn> wrote:
>> Kernel threads (PF_KTHREAD) are not subject to user-defined utilization
>> clamping. They do not represent user workloads and should not participate
>> in any uclamp logic, including:
> Indeed, some drivers use set_scheduler() on certain kthreads to
> improve performance.
> It is not a good idea to exclude them.
>
> Thanks!
>
I agree that kernel threads may need explicit scheduling control, so
it’s indeed not a good idea to exclude them unconditionally.

Our main concern was that uclamp_rq_inc() is a performance-sensitive
path, and letting default-initialized kthreads participate in clamp
aggregation could lead to unnecessary overhead and distort frequency
decisions.
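
To illustrate the concern about distorted frequency decisions, here is a minimal userspace model, not kernel code. It assumes only that the rq-level clamp is the maximum of the runnable tasks' clamp values. struct clamp_task and model_rq_clamp() are hypothetical names, and the bucket bookkeeping of the real implementation is left out.

```c
#define SCHED_CAPACITY_SCALE 1024

enum uclamp_id { UCLAMP_MIN, UCLAMP_MAX, UCLAMP_CNT };

/* Hypothetical stand-in for a task: only its two clamp values. */
struct clamp_task {
	const char *comm;
	unsigned int clamp[UCLAMP_CNT];
};

/*
 * Max-aggregate one clamp over the runnable tasks. This models the
 * idea of rq-level aggregation only; it assumes nr >= 1 and ignores
 * buckets, refcounts and the idle case.
 */
unsigned int model_rq_clamp(const struct clamp_task *tasks, int nr,
			    enum uclamp_id clamp_id)
{
	unsigned int value = 0;

	for (int i = 0; i < nr; i++) {
		if (tasks[i].clamp[clamp_id] > value)
			value = tasks[i].clamp[clamp_id];
	}
	return value;
}
```

In this model, a single task capped at UCLAMP_MAX = 512 gives an rq cap of 512. Adding a kthread with the default UCLAMP_MAX of 1024 raises the aggregated cap to 1024, so the cap no longer applies while that kthread is runnable.
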
We’ll rework the patch to be more selective — possibly skipping only
those kernel threads that don’t have user-defined clamp values.
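
A rough userspace sketch of what "more selective" could mean is below. It is an assumption about one possible shape of the check, not the reworked patch. struct model_task, struct model_uclamp_se and model_uclamp_skip_task() are hypothetical, and the user_defined flag is modeled on the per-clamp request state a task carries.

```c
#include <stdbool.h>

/* Illustrative flag bit for this model; the kernel defines its own. */
#define PF_KTHREAD 0x00200000

/* Hypothetical, reduced copy of a per-clamp request. */
struct model_uclamp_se {
	unsigned int value;
	bool user_defined;	/* set when the clamp was requested explicitly */
};

/* Hypothetical, reduced task: flags plus the two clamp requests. */
struct model_task {
	unsigned int flags;
	struct model_uclamp_se uclamp_req[2];	/* [0] = min, [1] = max */
};

/*
 * Return true only for kernel threads that still carry default clamp
 * requests. User tasks, and kthreads whose clamps were set explicitly
 * (for example by a driver), are never skipped.
 */
bool model_uclamp_skip_task(const struct model_task *p)
{
	if (!(p->flags & PF_KTHREAD))
		return false;

	for (int clamp_id = 0; clamp_id < 2; clamp_id++) {
		if (p->uclamp_req[clamp_id].user_defined)
			return false;
	}
	return true;
}
```

Any check of this kind would have to give the same answer in uclamp_rq_inc() and uclamp_rq_dec() for a given enqueue, otherwise the per-bucket accounting would go out of balance.
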
Thanks again for the helpful input!
>> - clamp initialization during fork/post-fork
>> - effective clamp value computation
>> - runtime aggregation (uclamp_rq_inc/dec)
>>
>> Allowing kernel threads into these paths may pollute the rq->uclamp[]
>> statistics, mislead schedutil governor's frequency selection, and
>> complicate debugging or trace interpretation.
>>
>> This patch ensures that:
>> - uclamp_fork() and uclamp_post_fork() skip kernel threads
>> - uclamp_eff_value() returns default values for kernel threads
>> - uclamp_rq_inc() and uclamp_rq_dec() skip kernel threads
>>
>> This aligns the semantics of uclamp with its original intent:
>> user-space task-specific clamping.
>>
>> dmesg in uclamp_rq_inc_id:
>> [ 76.373903] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:0 value:0 kthread:1
>> [ 76.375905] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:1 value:1024 kthread:1
>> [ 76.379837] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:0 value:0 kthread:1
>> [ 76.379839] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:1 value:1024 kthread:1
>> [ 76.379839] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:0 value:0 kthread:1
>> [ 76.379841] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:1 value:1024 kthread:1
>> [ 76.383897] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:0 value:0 kthread:1
>> [ 76.383897] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:0 value:0 kthread:1
>> [ 76.383900] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:1 value:1024 kthread:1
>> [ 76.383901] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:1 value:1024 kthread:1
>> [ 76.387885] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:0 value:0 kthread:1
>> [ 76.387885] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:0 value:0 kthread:1
>> [ 76.387888] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:1 value:1024 kthread:1
>> [ 76.387889] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:1 value:1024 kthread:1
>> [ 76.388139] uclamp_rq_inc_id: task:jbd2/sda3-8 pid:316 clamp_id:0 value:0 kthread:1
>> [ 76.388140] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:0 value:0 kthread:1
>> [ 76.388142] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:1 value:1024 kthread:1
>> [ 76.388143] uclamp_rq_inc_id: task:jbd2/sda3-8 pid:316 clamp_id:1 value:1024 kthread:1
>> [ 76.388169] uclamp_rq_inc_id: task:kworker/u48:6 pid:93 clamp_id:0 value:0 kthread:1
>> [ 76.388171] uclamp_rq_inc_id: task:kworker/u48:6 pid:93 clamp_id:1 value:1024 kthread:1
>> [ 76.388891] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:0 value:0 kthread:1
>> [ 76.388893] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:1 value:1024 kthread:1
>> [ 76.392900] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:0 value:0 kthread:1
>> [ 76.392902] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:1 value:1024 kthread:1
>> [ 76.398850] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:0 value:0 kthread:1
>> [ 76.398852] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:1 value:1024 kthread:1
>> [ 76.401880] uclamp_rq_inc_id: task:ksoftirqd/8 pid:67 clamp_id:0 value:0 kthread:1
>> [ 76.401883] uclamp_rq_inc_id: task:ksoftirqd/8 pid:67 clamp_id:1 value:1024 kthread:1
>> [ 76.409053] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:0 value:0 kthread:1
>> [ 76.409054] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:1 value:1024 kthread:1
>> [ 76.410881] uclamp_rq_inc_id: task:kworker/u48:10 pid:97 clamp_id:0 value:0 kthread:1
>> [ 76.410884] uclamp_rq_inc_id: task:kworker/u48:10 pid:97 clamp_id:1 value:1024 kthread:1
>> [ 76.419947] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:0 value:0 kthread:1
>> [ 76.419949] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:1 value:1024 kthread:1
>> [ 76.419976] uclamp_rq_inc_id: task:kworker/u48:6 pid:93 clamp_id:0 value:0 kthread:1
>> [ 76.419979] uclamp_rq_inc_id: task:kworker/u48:6 pid:93 clamp_id:1 value:1024 kthread:1
>> [ 76.420119] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:0 value:0 kthread:1
>> [ 76.420121] uclamp_rq_inc_id: task:kworker/2:1H pid:188 clamp_id:1 value:1024 kthread:1
>> [ 76.420642] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:0 value:0 kthread:1
>> [ 76.420644] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:1 value:1024 kthread:1
>> [ 76.434914] uclamp_rq_inc_id: task:kcompactd0 pid:108 clamp_id:0 value:0 kthread:1
>> [ 76.434916] uclamp_rq_inc_id: task:kcompactd0 pid:108 clamp_id:1 value:1024 kthread:1
>> [ 76.447689] uclamp_rq_inc_id: task:kworker/3:2 pid:244 clamp_id:0 value:0 kthread:1
>> [ 76.447691] uclamp_rq_inc_id: task:kworker/3:2 pid:244 clamp_id:1 value:1024 kthread:1
>> [ 76.447705] uclamp_rq_inc_id: task:ksoftirqd/3 pid:37 clamp_id:0 value:0 kthread:1
>> [ 76.447707] uclamp_rq_inc_id: task:ksoftirqd/3 pid:37 clamp_id:1 value:1024 kthread:1
>> [ 76.448809] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:0 value:0 kthread:1
>> [ 76.448811] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:1 value:1024 kthread:1
>> [ 76.451260] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:0 value:0 kthread:1
>> [ 76.451263] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:1 value:1024 kthread:1
>> [ 76.452806] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:0 value:0 kthread:1
>> [ 76.452808] uclamp_rq_inc_id: task:rcu_preempt pid:16 clamp_id:1 value:1024 kthread:1
>> [ 76.488052] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:0 value:0 kthread:1
>> [ 76.488054] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:1 value:1024 kthread:1
>> [ 76.488767] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:0 value:0 kthread:1
>> [ 76.488770] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:1 value:1024 kthread:1
>> [ 76.490847] uclamp_rq_inc_id: task:kworker/3:2 pid:244 clamp_id:0 value:0 kthread:1
>> [ 76.490848] uclamp_rq_inc_id: task:kworker/2:1 pid:143 clamp_id:0 value:0 kthread:1
>> [ 76.490849] uclamp_rq_inc_id: task:kworker/1:3 pid:462 clamp_id:0 value:0 kthread:1
>> [ 76.490848] uclamp_rq_inc_id: task:kworker/7:2 pid:687 clamp_id:0 value:0 kthread:1
>> [ 76.490849] uclamp_rq_inc_id: task:kworker/11:1 pid:146 clamp_id:0 value:0 kthread:1
>> [ 76.490850] uclamp_rq_inc_id: task:kworker/2:1 pid:143 clamp_id:1 value:1024 kthread:1
>> [ 76.490851] uclamp_rq_inc_id: task:kworker/3:2 pid:244 clamp_id:1 value:1024 kthread:1
>> [ 76.490851] uclamp_rq_inc_id: task:kworker/11:1 pid:146 clamp_id:1 value:1024 kthread:1
>> [ 76.490851] uclamp_rq_inc_id: task:kworker/7:2 pid:687 clamp_id:1 value:1024 kthread:1
>> [ 76.490853] uclamp_rq_inc_id: task:kworker/1:3 pid:462 clamp_id:1 value:1024 kthread:1
>> [ 76.490857] uclamp_rq_inc_id: task:kworker/5:1 pid:141 clamp_id:0 value:0 kthread:1
>> [ 76.490859] uclamp_rq_inc_id: task:kworker/5:1 pid:141 clamp_id:1 value:1024 kthread:1
>> [ 76.491850] uclamp_rq_inc_id: task:kworker/4:2 pid:534 clamp_id:0 value:0 kthread:1
>> [ 76.491852] uclamp_rq_inc_id: task:kworker/4:2 pid:534 clamp_id:1 value:1024 kthread:1
>> [ 76.504848] uclamp_rq_inc_id: task:kworker/10:2 pid:228 clamp_id:0 value:0 kthread:1
>> [ 76.504852] uclamp_rq_inc_id: task:kworker/10:2 pid:228 clamp_id:1 value:1024 kthread:1
>> [ 76.508785] uclamp_rq_inc_id: task:kworker/9:1 pid:142 clamp_id:0 value:0 kthread:1
>> [ 76.508787] uclamp_rq_inc_id: task:kworker/9:1 pid:142 clamp_id:1 value:1024 kthread:1
>> [ 76.514856] uclamp_rq_inc_id: task:kworker/u48:10 pid:97 clamp_id:0 value:0 kthread:1
>> [ 76.514859] uclamp_rq_inc_id: task:kworker/u48:10 pid:97 clamp_id:1 value:1024 kthread:1
>> [ 76.522742] uclamp_rq_inc_id: task:kworker/1:1H pid:153 clamp_id:0 value:0 kthread:1
>>
>> Signed-off-by: Zihuan Zhang <zhangzihuan@...inos.cn>
>> ---
>> kernel/sched/core.c | 13 +++++++++++++
>> 1 file changed, 13 insertions(+)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 8988d38d46a3..a1e6b4157682 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -1630,6 +1630,9 @@ unsigned long uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id)
>>  {
>>  	struct uclamp_se uc_eff;
>>
>> +	if (p->flags & PF_KTHREAD)
>> +		return uclamp_none(clamp_id);
>> +
>>  	/* Task currently refcounted: use back-annotated (effective) value */
>>  	if (p->uclamp[clamp_id].active)
>>  		return (unsigned long)p->uclamp[clamp_id].value;
>> @@ -1769,6 +1772,9 @@ static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p, int flags
>>  	if (unlikely(!p->sched_class->uclamp_enabled))
>>  		return;
>>
>> +	if (p->flags & PF_KTHREAD)
>> +		return;
>> +
>>  	/* Only inc the delayed task which being woken up. */
>>  	if (p->se.sched_delayed && !(flags & ENQUEUE_DELAYED))
>>  		return;
>> @@ -1797,6 +1803,9 @@ static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p)
>>  	if (unlikely(!p->sched_class->uclamp_enabled))
>>  		return;
>>
>> +	if (p->flags & PF_KTHREAD)
>> +		return;
>> +
>>  	if (p->se.sched_delayed)
>>  		return;
>>
>> @@ -1977,6 +1986,8 @@ static void uclamp_fork(struct task_struct *p)
>>  {
>>  	enum uclamp_id clamp_id;
>>
>> +	if (p->flags & PF_KTHREAD)
>> +		return;
>>  	/*
>>  	 * We don't need to hold task_rq_lock() when updating p->uclamp_* here
>>  	 * as the task is still at its early fork stages.
>> @@ -1995,6 +2006,8 @@ static void uclamp_fork(struct task_struct *p)
>>
>>  static void uclamp_post_fork(struct task_struct *p)
>>  {
>> +	if (p->flags & PF_KTHREAD)
>> +		return;
>>  	uclamp_update_util_min_rt_default(p);
>>  }
>>
>> --
>> 2.25.1
>>
>>
Best regards,
Zihuan