[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230717111053.GI4253@hirez.programming.kicks-ass.net>
Date: Mon, 17 Jul 2023 13:10:53 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>
Cc: Tim Chen <tim.c.chen@...ux.intel.com>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Ricardo Neri <ricardo.neri@...el.com>,
"Ravi V . Shankar" <ravi.v.shankar@...el.com>,
Ben Segall <bsegall@...gle.com>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Len Brown <len.brown@...el.com>, Mel Gorman <mgorman@...e.de>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
Steven Rostedt <rostedt@...dmis.org>,
Valentin Schneider <vschneid@...hat.com>,
Ionela Voinescu <ionela.voinescu@....com>, x86@...nel.org,
linux-kernel@...r.kernel.org,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
naveen.n.rao@...ux.vnet.ibm.com,
Yicong Yang <yangyicong@...ilicon.com>,
Barry Song <v-songbaohua@...o.com>,
Chen Yu <yu.c.chen@...el.com>, Hillf Danton <hdanton@...a.com>
Subject: Re: [Patch v3 1/6] sched/fair: Determine active load balance for SMT
sched groups
On Mon, Jul 17, 2023 at 01:06:59AM +0530, Shrikanth Hegde wrote:
>
>
> On 7/15/23 4:35 AM, Tim Chen wrote:
> > On Fri, 2023-07-14 at 18:36 +0530, Shrikanth Hegde wrote:
> >
> >>
> >>
> >> If we consider symmetric platforms which have SMT4 such as power10.
> >> we have a topology like below. multiple such MC will form DIE(PKG)
> >>
> >>
> >> [0 2 4 6][1 3 5 7][8 10 12 14][9 11 13 15]
> >> [--SMT--][--SMT--][----SMT---][---SMT----]
> >> [--sg1--][--sg1--][---sg1----][---sg1----]
> >> [--------------MC------------------------]
> >>
> >> In case of SMT4, if there is any group which has 2 or more tasks, that
> >> group will be marked as group_smt_balance. previously, if that group had 2
> >> or 3 tasks, it would have been marked as group_has_spare. Since all the groups have
> >> SMT that means behavior would be same fully busy right? That can cause some
> >> corner cases. No?
> >
> > You raised a good point. I was looking from SMT2
> > perspective so group_smt_balance implies group_fully_busy.
> > That is no longer true for SMT4.
> >
> > I am thinking of the following fix on the current patch
> > to take care of SMT4. Do you think this addresses
>
> Thanks Tim for taking a look at it again.
>
> Yes. I think this would address some of the corner cases.
> Any SMT4 group having 2,3,4 will have smt_balance as the group type, and busiest one
> is the one which has least number of idle cpu's. (same conditions as group_has_spare)
>
>
>
>
> > concerns from you and Tobias?
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 294a662c9410..3fc8d3a3bd22 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9588,6 +9588,17 @@ static bool update_sd_pick_busiest(struct lb_env *env,
> > break;
> >
> > case group_smt_balance:
> > + /* no idle cpus on both groups handled by group_fully_busy below */
> > + if (sgs->idle_cpus != 0 || busiest->idle_cpus != 0) {
> > + if (sgs->idle_cpus > busiest->idle_cpus)
> > + return false;
> > + if (sgs->idle_cpus < busiest->idle_cpus)
> > + return true;
> > + if (sgs->sum_nr_running <= busiest_sum_nr_running)
> > + return false;
> > + else
> > + return true;
> > + }
> >
> >
> > I will be on vacation next three weeks so my response will be slow.
> >
> > Tim
> >
> >>
>
> Small suggestion to above code to avoid compiler warning of switch case falling
> through and else case can be removed, since update_sd_pick_busiest by default returns true.
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e5a75c76bcaa..ae364ac6f22e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9728,9 +9728,9 @@ static bool update_sd_pick_busiest(struct lb_env *env,
> return true;
> if (sgs->sum_nr_running <= busiest->sum_nr_running)
> return false;
> - else
> - return true;
> }
> + break;
> +
> case group_fully_busy:
> /*
> * Select the fully busy group with highest avg_load. In
>
>
Can someone please send a full patch for this? I've already queued Tim's
patches in tip/sched/core (tip-bot seems to have died somewhere last
week, it's being worked on).
Powered by blists - more mailing lists