Date:   Fri, 8 Jul 2022 14:35:11 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Dietmar Eggemann <dietmar.eggemann@....com>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com,
        linux-kernel@...r.kernel.org, david.chen@...anix.com,
        zhangqiao22@...wei.com
Subject: Re: [PATCH] sched/fair: fix case with reduced capacity CPU

On Fri, 8 Jul 2022 at 12:10:40 (+0200), Dietmar Eggemann wrote:
> On 08/07/2022 09:17, Vincent Guittot wrote:
> > On Thu, 7 Jul 2022 at 18:43, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> >>
> >> On 02/07/2022 06:52, Vincent Guittot wrote:
> 
> [...]
> 
> >>> The rework of the load balance has filterd the case when the CPU is
> 
> s/filterd/filtered
> 
> >>> classified to be fully busy but its capacity is reduced.
> >>>
> >>> Check if CPU's capacity is reduced while gathering load balance statistics
> >>> and classify it group_misfit_task instead of group_fully_busy so we can
> 
> enum group_type {
> 
>    ...
>    /*
>     * SD_ASYM_CPUCAPACITY only: One task doesn't fit with CPU's capacity
>     * and must be migrated to a more powerful CPU.
>     */
>    group_misfit_task
>    ...
> 
> This `SD_ASYM_CPUCAPACITY only:` should be removed now.

Yes
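
For reference, removing that prefix would leave the comment as below (just the
change agreed on above, with the rest of the wording kept as is):

	/*
	 * One task doesn't fit with CPU's capacity and must be migrated to a
	 * more powerful CPU.
	 */
	group_misfit_task,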

> 
> [...]
> 
> >>> @@ -8798,6 +8798,19 @@ sched_asym(struct lb_env *env, struct sd_lb_stats *sds,  struct sg_lb_stats *sgs
> >>>       return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu);
> >>>  }
> >>>
> >>> +static inline bool
> >>> +sched_reduced_capacity(struct rq *rq, struct sched_domain *sd)
> 
> minor: Why not `static inline int check_reduced_capacity()`? All
> similar functions, like check_cpu_capacity(), follow this approach.

Mainly because it aims to return true or false.
IIRC, check_cpu_capacity() replaced fix_small_capacity(), which was not a bool but
returned the number of tasks a group could handle, and it kept the int return type.
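
The body of the helper isn't quoted in the hunk above, but given the signature,
the reuse of check_cpu_capacity() discussed here, and the rq->cfs.h_nr_running = 1
case mentioned further down, it boils down to something like the sketch below;
the exact wording and comments in the patch may differ:

static inline bool
sched_reduced_capacity(struct rq *rq, struct sched_domain *sd)
{
	/*
	 * Only the single-runnable-task case is of interest here; with more
	 * tasks the group is classified as overloaded anyway.
	 */
	if (rq->cfs.h_nr_running != 1)
		return false;

	/* Has the CPU's capacity been noticeably reduced by RT/DL/IRQ pressure? */
	return check_cpu_capacity(rq, sd);
}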

> 
> [...]
> 
> >>> @@ -8851,11 +8865,17 @@ static inline void update_sg_lb_stats(struct lb_env *env,
> >>>               if (local_group)
> >>>                       continue;
> >>>
> >>> -             /* Check for a misfit task on the cpu */
> >>> -             if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
> >>> -                 sgs->group_misfit_task_load < rq->misfit_task_load) {
> >>> -                     sgs->group_misfit_task_load = rq->misfit_task_load;
> >>> -                     *sg_status |= SG_OVERLOAD;
> >>> +             if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
> >>> +                     /* Check for a misfit task on the cpu */
> >>> +                     if (sgs->group_misfit_task_load < rq->misfit_task_load) {
> >>> +                             sgs->group_misfit_task_load = rq->misfit_task_load;
> >>> +                             *sg_status |= SG_OVERLOAD;
> >>> +                     }
> >>> +             } else if ((env->idle != CPU_NOT_IDLE) &&
> >>> +                        sched_reduced_capacity(rq, env->sd) &&
> >>> +                        (sgs->group_misfit_task_load < load)) {
> >>> +                     /* Check for a task running on a CPU with reduced capacity */
> >>> +                     sgs->group_misfit_task_load = load;
> >>>               }
> 
> Minor:
> 
> This now has
> 
>     if(A)
>             if(B)
>     else if(C && B')
> 
> which is a little bit harder to read.
> 

Yeah, I started with the version below but then optimized it. I can go back to
that version if it's easier to read:

		if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
			/* Check for a misfit task on the cpu */
			if (sgs->group_misfit_task_load < rq->misfit_task_load) {
				sgs->group_misfit_task_load = rq->misfit_task_load;
				*sg_status |= SG_OVERLOAD;
			}
		} else if ((env->idle != CPU_NOT_IDLE) &&
			   sched_reduced_capacity(rq, env->sd)) {
			/* Check for a task running on a CPU with reduced capacity */
			if (sgs->group_misfit_task_load < load)
				sgs->group_misfit_task_load = load;
		}


> [...]
> 
> >> I'm wondering why you've chosen that hybrid approach `group_misfit_task
> >> -> migrate_load` and not `group_misfit_task -> migrate_misfit`.
> > 
> > Because it means enabling the tracking of misfit tasks on the rq at each
> > task enqueue/dequeue/tick ... Then misfit handling for heterogeneous
> > platforms checks max_cpu_capacity, which we don't care about here and
> > which would trigger unwanted misfit migrations on SMP.
> 
> Agreed, rq->misfit_task_load can't be used here.
> 
> >> It looks like this `rq->cfs.h_nr_running = 1` case almost always (since we
> >> check `busiest->nr_running > 1`) ends up in the `if (!ld_moved)` path of
> >> load_balance(), where need_active_balance() can return 1 via its
> >> `if ((env->idle != CPU_NOT_IDLE) && ...` condition. This leads to an
> >> active load_balance.
> >>
> >> IMHO, you could achieve the same by staying with
> >> `group_misfit_task -> migrate_misfit`.
> >>
> >> I think cpu_load(rq) can be used instead of `rq->misfit_task_load` in
> >> the migrate_misfit case of find_busiest_queue() too.
> > 
> > I don't think so, because you can have a higher cpu_load() without being misfit.
> 
> You're right, I forgot about this. Essentially we would need extra state
> (e.g. in lb_env) to save which CPU in the busiest group has the misfit.
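
For reference on the max_cpu_capacity point above, the heterogeneous misfit gate
in kernel/sched/fair.c is roughly the following (paraphrased from this kernel
version, not a verbatim quote):

/*
 * Check whether the rq has a misfit task and whether it looks like we can
 * actually help: either the task can move to a CPU with more capacity, or
 * the current CPU's capacity is noticeably pressured.
 */
static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
{
	return rq->misfit_task_load &&
		(rq->cpu_capacity_orig < rq->rd->max_cpu_capacity ||
		 check_cpu_capacity(rq, sd));
}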

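And the need_active_balance() condition referred to above is its
(env->idle != CPU_NOT_IDLE) branch, roughly (again paraphrased, details may
differ in this exact version):

	/*
	 * The dst CPU is idle and the src rq has only one CFS task; migrating
	 * it is worth it if the src CPU's capacity is reduced by other sched
	 * classes or IRQ while more capacity stays available on dst.
	 */
	if ((env->idle != CPU_NOT_IDLE) &&
	    (env->src_rq->cfs.h_nr_running == 1)) {
		if ((check_cpu_capacity(env->src_rq, sd)) &&
		    (capacity_of(env->src_cpu) * sd->imbalance_pct <
		     capacity_of(env->dst_cpu) * 100))
			return 1;
	}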