Message-ID: <597384e3-6519-b10e-081b-30c3f89b6e3f@gmail.com>
Date: Mon, 13 Jan 2025 17:21:32 +0800
From: Hao Jia <jiahao.kernel@...il.com>
To: mingo@...hat.com, peterz@...radead.org, mingo@...nel.org,
juri.lelli@...hat.com, vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com
Cc: linux-kernel@...r.kernel.org, Hao Jia <jiahao1@...iang.com>
Subject: Re: [PATCH v2] sched/core: Prioritize migrating eligible tasks in
sched_balance_rq()

Friendly ping...

On 2024/12/23 17:14, Hao Jia wrote:
> From: Hao Jia <jiahao1@...iang.com>
>
> When the PLACE_LAG scheduling feature is enabled and
> dst_cfs_rq->nr_queued is greater than 1, a task that is
> ineligible (lag < 0) on the source cpu runqueue will also be
> ineligible after it is migrated to the destination cpu runqueue.
> This is because place_entity() preserves the task's original
> equivalent lag, so a task that was ineligible before migration
> remains ineligible afterwards.
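> 
> As a minimal sketch (simplified; the real check is
> entity_eligible() in kernel/sched/fair.c, which uses
> load-weighted arithmetic, and the helper name here is made up
> for illustration): under EEVDF, lag_i = w_i * (V - v_i), where
> V is the queue's weighted average vruntime, so eligibility
> reduces to a vruntime comparison:
> 
> 	/* Eligible iff the entity's vruntime has not passed V. */
> 	static inline bool eligible_sketch(s64 avg_vruntime, s64 vruntime)
> 	{
> 		return vruntime <= avg_vruntime;
> 	}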
>
> So in sched_balance_rq(), we prioritize migrating eligible
> tasks, and we soft-limit ineligible tasks: they are allowed to
> migrate only when nr_balance_failed is non-zero, so that load
> balancing does not try very hard to balance the load. Since
> nr_balance_failed is only incremented after a balance attempt
> fails to move any task, ineligible tasks can still be migrated
> once balancing has already failed at least once.
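> 
> Condensed, the gate added to can_migrate_task() (the full hunk
> is below) is:
> 
> 	/* Skip ineligible tasks until balancing has already failed. */
> 	if (!env->sd->nr_balance_failed &&
> 	    task_is_ineligible_on_dst_cpu(p, env->dst_cpu))
> 		return 0;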
>
> Below are some benchmark results. In my testing, this patch
> shows a slight improvement on hackbench.
>
> Benchmark
> =========
>
> All of the benchmarks are done inside a normal cpu cgroup in a
> clean environment with cpu turbo disabled. The test machine is a
> single NUMA node machine with a 13th Gen Intel(R) Core(TM)
> i7-13700, 12 Core/24 HT.
>
> Based on master b86545e02e8c.
>
> Results
> =======
>
> hackbench-process-pipes
>                        vanilla          patched
> Amean     1       0.5837 (  0.00%)    0.5733 (  1.77%)
> Amean     4       1.4423 (  0.00%)    1.4503 ( -0.55%)
> Amean     7       2.5147 (  0.00%)    2.4773 (  1.48%)
> Amean     12      3.9347 (  0.00%)    3.8880 (  1.19%)
> Amean     21      5.3943 (  0.00%)    5.3873 (  0.13%)
> Amean     30      6.7840 (  0.00%)    6.6660 (  1.74%)
> Amean     48      9.8313 (  0.00%)    9.6100 (  2.25%)
> Amean     79     15.4403 (  0.00%)   14.9580 (  3.12%)
> Amean     96     18.4970 (  0.00%)   17.9533 (  2.94%)
>
> hackbench-process-sockets
>                        vanilla          patched
> Amean     1       0.6297 (  0.00%)    0.6223 (  1.16%)
> Amean     4       2.1517 (  0.00%)    2.0887 (  2.93%)
> Amean     7       3.6377 (  0.00%)    3.5670 (  1.94%)
> Amean     12      6.1277 (  0.00%)    5.9290 (  3.24%)
> Amean     21     10.0380 (  0.00%)    9.7623 (  2.75%)
> Amean     30     14.1517 (  0.00%)   13.7513 (  2.83%)
> Amean     48     24.7253 (  0.00%)   24.2287 (  2.01%)
> Amean     79     43.9523 (  0.00%)   43.2330 (  1.64%)
> Amean     96     54.5310 (  0.00%)   53.7650 (  1.40%)
>
> tbench4 Throughput
>                        vanilla          patched
> Hmean     1      255.97 (  0.00%)    275.01 (  7.44%)
> Hmean     2      511.60 (  0.00%)    544.27 (  6.39%)
> Hmean     4      996.70 (  0.00%)   1006.57 (  0.99%)
> Hmean     8     1646.46 (  0.00%)   1649.15 (  0.16%)
> Hmean     16    2259.42 (  0.00%)   2274.35 (  0.66%)
> Hmean     32    4725.48 (  0.00%)   4735.57 (  0.21%)
> Hmean     64    4411.47 (  0.00%)   4400.05 ( -0.26%)
> Hmean     96    4284.31 (  0.00%)   4267.39 ( -0.39%)
>
> Suggested-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> Signed-off-by: Hao Jia <jiahao1@...iang.com>
> ---
> Previous discussion link: https://lore.kernel.org/all/20241128084858.25220-1-jiahao.kernel@gmail.com
> Link to v1: https://lore.kernel.org/all/20241218080203.80556-1-jiahao.kernel@gmail.com
>
> v1 to v2:
> - Modify dst_cfs_rq->nr_running to dst_cfs_rq->nr_queued to
> resolve conflicts with commit 736c55a02c47 ("sched/fair:
> Rename cfs_rq.nr_running into nr_queued").
>
> kernel/sched/fair.c | 34 ++++++++++++++++++++++++++++++++++
> 1 file changed, 34 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5599b0c1ba9b..c884bf631e66 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9396,6 +9396,30 @@ static inline int migrate_degrades_locality(struct task_struct *p,
> }
> #endif
>
> +/*
> + * Check whether the task is ineligible on the destination cpu
> + *
> + * When the PLACE_LAG scheduling feature is enabled and
> + * dst_cfs_rq->nr_queued is greater than 1, if the task
> + * is ineligible, it will also be ineligible when
> + * it is migrated to the destination cpu.
> + */
> +static inline int task_is_ineligible_on_dst_cpu(struct task_struct *p, int dest_cpu)
> +{
> + struct cfs_rq *dst_cfs_rq;
> +
> +#ifdef CONFIG_FAIR_GROUP_SCHED
> + dst_cfs_rq = task_group(p)->cfs_rq[dest_cpu];
> +#else
> + dst_cfs_rq = &cpu_rq(dest_cpu)->cfs;
> +#endif
> + if (sched_feat(PLACE_LAG) && dst_cfs_rq->nr_queued &&
> + !entity_eligible(task_cfs_rq(p), &p->se))
> + return 1;
> +
> + return 0;
> +}
> +
> /*
> * can_migrate_task - may task p from runqueue rq be migrated to this_cpu?
> */
> @@ -9420,6 +9444,16 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
> if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
> return 0;
>
> + /*
> + * We want to prioritize the migration of eligible tasks.
> + * For ineligible tasks we soft-limit them and only allow
> + * them to migrate when nr_balance_failed is non-zero to
> + * avoid load-balancing trying very hard to balance the load.
> + */
> + if (!env->sd->nr_balance_failed &&
> + task_is_ineligible_on_dst_cpu(p, env->dst_cpu))
> + return 0;
> +
> /* Disregard percpu kthreads; they are where they need to be. */
> if (kthread_is_per_cpu(p))
> return 0;