Message-ID: <ZaqZ8wXzNvqUH8Jn@vingu-book>
Date: Fri, 19 Jan 2024 16:49:07 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Konstantin Khorenko <khorenko@...tuozzo.com>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Daniel Bristot de Oliveira <bristot@...hat.com>,
	Valentin Schneider <vschneid@...hat.com>,
	Alexander Atanasov <alexander.atanasov@...tuozzo.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 RESEND] sched/fair: Do not scan non-movable tasks
 several times

On Monday, 15 Jan 2024 at 13:50:52 (+0300), Konstantin Khorenko wrote:
> If busiest rq is small, nr_running < SCHED_NR_MIGRATE_BREAK and all
> tasks are not movable, detach_tasks() should not iterate more than tasks
> available in the busiest rq.
> 
> Before commit b0defa7ae03e ("sched/fair: Make sure to try to detach at
> least one movable task"), the (env->loop > env->loop_max) condition
> prevented us from scanning non-movable tasks more than rq size times,
> but since we started checking the LBF_ALL_PINNED flag, that bound no
> longer holds in the "all tasks are not movable" case.
> 
> Note: if none of the tasks in the rq can be moved in detach_tasks(),
> we always increase loop_break by SCHED_NR_MIGRATE_BREAK, so we can step
> over loop_max, but I think it's a rare case and it is not worth adding
> an extra check here for rq->nr_running overlimit.


In that case, why not do the below? It is close to your first version.

---
 kernel/sched/fair.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fce22b4462bb..1dae6cdf8561 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11344,6 +11344,13 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		env.loop_max  = min(sysctl_sched_nr_migrate, busiest->nr_running);

 more_balance:
+		/*
+		 * If busiest rq is small, nr_running < SCHED_NR_MIGRATE_BREAK
+		 * and all tasks are not movable, detach_tasks() should not
+		 * iterate more than tasks available in rq.
+		 */
+		env.loop_break = min(env.loop_break, busiest->nr_running);
+
 		rq_lock_irqsave(busiest, &rf);
 		update_rq_clock(busiest);

--
2.34.1
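
To make the effect of the clamp concrete, here is a small userspace sketch of
the bail-out logic, not the kernel code itself. It assumes
SCHED_NR_MIGRATE_BREAK is 32 (the non-PREEMPT_RT value), models only the
loop_max and loop_break checks of detach_tasks(), and treats every task as
non-movable. The helper names clamp_loop_break() and scans_in_one_pass() are
hypothetical and exist only for this illustration.

```c
/* Assumed value for non-PREEMPT_RT kernels; not taken from this thread. */
#define SCHED_NR_MIGRATE_BREAK 32u

/* Hypothetical helper mirroring the min() clamp proposed in the patch. */
static unsigned int clamp_loop_break(unsigned int loop_break,
				     unsigned int nr_running)
{
	return loop_break < nr_running ? loop_break : nr_running;
}

/*
 * Toy model of one detach_tasks() pass in which nothing is detached.
 * Returns how many task entries are examined before the loop bails out.
 * all_pinned models LBF_ALL_PINNED staying set (1) or cleared (0) for
 * the whole pass; the real flag is updated per task.
 */
static unsigned int scans_in_one_pass(unsigned int nr_running,
				      unsigned int loop_max,
				      unsigned int loop_break,
				      int all_pinned)
{
	unsigned int loop = 0;
	unsigned int scanned = 0;

	/* The task list never empties because every task is put back. */
	while (nr_running > 0) {
		loop++;

		/* loop_max only stops the scan once a movable task was seen. */
		if (loop > loop_max && !all_pinned)
			break;

		/* Take a breather; the caller decides whether to retry. */
		if (loop > loop_break)
			break;

		/* The task is examined, found non-movable, and re-queued. */
		scanned++;
	}

	return scanned;
}
```

In this model, a busiest rq with 4 pinned tasks and loop_max = 4 has 32
entries examined when loop_break is left at SCHED_NR_MIGRATE_BREAK, and 4
when loop_break is first passed through clamp_loop_break(). The real function
has additional exit conditions (idle checks, imbalance accounting) that are
left out here.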


> 
> Fixes: b0defa7ae03e ("sched/fair: Make sure to try to detach at least
> one movable task")
> 
> Signed-off-by: Konstantin Khorenko <khorenko@...tuozzo.com>
> ---
>  kernel/sched/fair.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 533547e3c90a..920fb16e6e2f 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11277,7 +11277,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  		.dst_rq		= this_rq,
>  		.dst_grpmask    = group_balance_mask(sd->groups),
>  		.idle		= idle,
> -		.loop_break	= SCHED_NR_MIGRATE_BREAK,
>  		.cpus		= cpus,
>  		.fbq_type	= all,
>  		.tasks		= LIST_HEAD_INIT(env.tasks),
> @@ -11324,6 +11323,14 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  		 */
>  		env.loop_max  = min(sysctl_sched_nr_migrate, busiest->nr_running);
>  
> +more_balance_reset_break:
> +		/*
> +		 * If busiest rq is small, nr_running < SCHED_NR_MIGRATE_BREAK
> +		 * and all tasks are not movable, detach_tasks() should not
> +		 * iterate more than tasks available in rq.
> +		 */
> +		env.loop_break = min(SCHED_NR_MIGRATE_BREAK, busiest->nr_running);
> +
>  more_balance:
>  		rq_lock_irqsave(busiest, &rf);
>  		update_rq_clock(busiest);
> @@ -11386,13 +11393,12 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  			env.dst_cpu	 = env.new_dst_cpu;
>  			env.flags	&= ~LBF_DST_PINNED;
>  			env.loop	 = 0;
> -			env.loop_break	 = SCHED_NR_MIGRATE_BREAK;
>  
>  			/*
>  			 * Go back to "more_balance" rather than "redo" since we
>  			 * need to continue with same src_cpu.
>  			 */
> -			goto more_balance;
> +			goto more_balance_reset_break;
>  		}
>  
>  		/*
> @@ -11418,7 +11424,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  			 */
>  			if (!cpumask_subset(cpus, env.dst_grpmask)) {
>  				env.loop = 0;
> -				env.loop_break = SCHED_NR_MIGRATE_BREAK;
>  				goto redo;
>  			}
>  			goto out_all_pinned;
> -- 
> 2.39.3
> 
