[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <13467b08-96a6-478d-bb97-060b7d8887e4@amd.com>
Date: Wed, 3 Sep 2025 16:01:03 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Aaron Lu <ziqianlu@...edance.com>
CC: Valentin Schneider <vschneid@...hat.com>, Ben Segall <bsegall@...gle.com>,
Peter Zijlstra <peterz@...radead.org>, Chengming Zhou
<chengming.zhou@...ux.dev>, Josh Don <joshdon@...gle.com>, Ingo Molnar
<mingo@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>, Xi Wang
<xii@...gle.com>, <linux-kernel@...r.kernel.org>, Juri Lelli
<juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>, "Steven
Rostedt" <rostedt@...dmis.org>, Mel Gorman <mgorman@...e.de>, Chuyi Zhou
<zhouchuyi@...edance.com>, Jan Kiszka <jan.kiszka@...mens.com>, "Florian
Bezdeka" <florian.bezdeka@...mens.com>, Songtang Liu
<liusongtang@...edance.com>, Sebastian Andrzej Siewior
<bigeasy@...utronix.de>
Subject: Re: [PATCH v3 3/5] sched/fair: Switch to task based throttle model
Hello Aaron,
On 9/3/2025 3:41 PM, Aaron Lu wrote:
> Hi Prateek,
>
> On Wed, Sep 03, 2025 at 02:41:55PM +0530, K Prateek Nayak wrote:
>> Hello Aaron,
>>
>> On 9/3/2025 12:44 PM, Aaron Lu wrote:
>>> On Fri, Aug 22, 2025 at 07:07:01PM +0800, Aaron Lu wrote:
>>>> With this said, I reduced the task number and retested on this 2nd AMD
>>>> Genoa:
>>>> - quota set to 50 cpu for each level1 cgroup;
>>
>> What exactly is the quota and period when you say 50cpu?
>
> period is the default 100000 and quota is set to 5000000.
Thank you! I'll do some tests on my end as well.
[..snip..]
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index df8dc389af8e1..3e927b9b7eeb6 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9370,6 +9370,9 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
> if ((p->se.sched_delayed) && (env->migration_type != migrate_load))
> return 0;
>
> + if (throttled_hierarchy(task_group(p)->cfs_rq[env->dst_cpu]))
> + return 0;
> +
This makes sense instead of the full throttled_lb_pair(). You'll still
need to put it behind CONFIG_CGROUP_SCHED (or better yet
CONFIG_CFS_BANDWIDTH) since task_group() can return NULL if GROUP_SCHED
is not enabled.
> /*
> * We want to prioritize the migration of eligible tasks.
> * For ineligible tasks we soft-limit them and only allow
>
> base head' diff head(patch1-5)
> Time 82.55±4.82% 83.81±2.89% -1.5% 99.69±6.71%
>
> head': head + above diff
>
> I also tested netperf on this AMD system as well as hackbench and
> netperf on Intel EMR, no obvious performance difference observed
> after applying the above diff, i.e. base and head' performance is
> roughly the same.
>
> Does the above diff make sense? One thing I'm slightly concerned is,
> there may be one case when balancing a task to a throttled target
> cfs_rq makes sense: if the task holds some kernel resource and is
> running inside kernel, even its target cfs_rq is throttled, we still
> want this task to go there and finish its job in kernel mode sooner,
> this could help other resource waiters. But, this may not be a big deal
I think it is still an improvement over per-cfs_rq throttling from a
tail latency perspective.
> and in most of the time, balancing a task to a throttled cfs_rq doesn't
> look like a meaningful thing to do.Ack.
--
Thanks and Regards,
Prateek
Powered by blists - more mailing lists