Message-ID: <CAKfTPtCZiD7+WxCvrOYoiT+xMRGj=emJEa+RJeOC88h6wu0pwg@mail.gmail.com>
Date: Tue, 6 Aug 2024 00:59:31 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Qais Yousef <qyousef@...alina.io>
Cc: Chen Yu <yu.c.chen@...el.com>, Ingo Molnar <mingo@...hat.com>, 
	Peter Zijlstra <peterz@...radead.org>, Dietmar Eggemann <dietmar.eggemann@....com>, 
	Juri Lelli <juri.lelli@...hat.com>, Hongyan Xia <hongyan.xia2@....com>, 
	Lukasz Luba <lukasz.luba@....com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] sched/pelt: Use rq_clock_task() for hw_pressure

On Tue, 6 Aug 2024 at 00:25, Qais Yousef <qyousef@...alina.io> wrote:
>
> On 08/05/24 12:56, Vincent Guittot wrote:
> > Sorry for the late reply on this
> >
> > On Mon, 29 Jul 2024 at 09:05, Chen Yu <yu.c.chen@...el.com> wrote:
> > >
> > > commit 97450eb90965 ("sched/pelt: Remove shift of thermal clock")
> > > removed the decay_shift for hw_pressure. That commit uses
> > > rq_clock_task() in sched_tick() while it replaces the shifted
> > > rq_clock_task() with rq_clock_pelt() in __update_blocked_others().
> >
> > Good catch, it should be rq_clock_task() everywhere
> >
> > > This could bring inconsistency. One possible scenario I can think of
> > > is in ___update_load_sum():
> > >
> > > u64 delta = now - sa->last_update_time
> > >
> > > 'now' could be calculated by rq_clock_pelt() from
> > > __update_blocked_others(), while last_update_time was calculated by
> > > rq_clock_task() earlier in sched_tick(). Usually the former
> > > lags behind the latter, which causes a very large 'delta' and brings
> > > unexpected behavior.
> > >
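For illustration only, a minimal userspace sketch of the arithmetic
described above (not kernel code, the timestamps are made up): when the
pelt-based 'now' is behind a task-clock-based last_update_time, the
unsigned subtraction wraps around to a huge value.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* last_update_time stamped earlier via rq_clock_task() (made-up value) */
	uint64_t last_update_time = 1000000;
	/* 'now' read via rq_clock_pelt(), which lags the task clock */
	uint64_t now = 900000;

	/* same expression as in ___update_load_sum(): wraps to ~2^64 here */
	uint64_t delta = now - last_update_time;

	printf("delta = %llu\n", (unsigned long long)delta);
	return 0;
}
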
> > > Fix this by using rq_clock_task() inside update_hw_load_avg(),
> > > because:
> >
> > No, don't call rq_clock_task() inside update_hw_load_avg(); keep
> > it outside. V2 was the correct solution.
>
> Curious, why is this not correct?

Not correct in the sense that we shouldn't embed the clock in the pelt
function but keep it outside like the others. The only one doing this is
update_irq_load_avg(), which has been on my todo list for a while to be
aligned with the others (I'm just missing the time).
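
As a rough sketch of that convention (userspace illustration only;
read_clock() and update_one() are made-up stand-ins, not kernel code):
reading the clock once in the caller lets every updater see the same
'now', whereas a helper that reads the clock itself can observe a
slightly different instant on each call.

#include <stdint.h>
#include <stdio.h>

static uint64_t fake_clock = 1000;

/* stand-in for a clock read; advances a little on every call */
static uint64_t read_clock(void)
{
	return fake_clock += 3;
}

/* stand-in for an update_*_load_avg() helper: just reports its delta */
static uint64_t update_one(uint64_t now, uint64_t last_update_time)
{
	return now - last_update_time;
}

int main(void)
{
	uint64_t last = read_clock();

	/* clock read once at the call site, same 'now' passed to each updater */
	uint64_t now = read_clock();
	printf("shared now:    %llu %llu\n",
	       (unsigned long long)update_one(now, last),
	       (unsigned long long)update_one(now, last));

	/* each helper reads the clock itself: the updaters see different instants */
	printf("private reads: %llu %llu\n",
	       (unsigned long long)update_one(read_clock(), last),
	       (unsigned long long)update_one(read_clock(), last));
	return 0;
}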

>
> > Nack for this v3. Please go back to v2.
> >
> > > 1. hw_pressure doesn't care about invariance.
> > > 2. avoid any inconsistency in the future.
> > >
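For reason 1 above, a small userspace sketch of why rq_clock_pelt() lags
rq_clock_task() whenever the CPU runs below maximum frequency or
capacity (illustration only; modeled on my understanding of cap_scale()
and update_rq_clock_pelt(), with made-up capacity values):

#include <stdint.h>
#include <stdio.h>

#define SCHED_CAPACITY_SHIFT	10

/* mimic the kernel's cap_scale(): scale a delta by a capacity factor */
static uint64_t cap_scale(uint64_t v, uint64_t s)
{
	return (v * s) >> SCHED_CAPACITY_SHIFT;
}

int main(void)
{
	uint64_t delta = 4000;		/* ns advanced on the task clock */
	uint64_t cpu_cap = 512;		/* CPU at half of maximum capacity */
	uint64_t freq_cap = 768;	/* running at 75% of maximum frequency */

	/* the pelt clock only advances by the doubly scaled delta */
	uint64_t pelt_delta = cap_scale(cap_scale(delta, cpu_cap), freq_cap);

	printf("task clock +%llu ns, pelt clock +%llu ns\n",
	       (unsigned long long)delta, (unsigned long long)pelt_delta);
	return 0;
}

hw_pressure wants plain wall-clock decay of the capped capacity, not this
scaled time, which is, as I read it, what reason 1 means by not caring
about invariance.
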
> > > Fixes: 97450eb90965 ("sched/pelt: Remove shift of thermal clock")
> > > Suggested-by: Qais Yousef <qyousef@...alina.io>
> > > Reviewed-by: Hongyan Xia <hongyan.xia2@....com>
> > > Signed-off-by: Chen Yu <yu.c.chen@...el.com>
> > > ---
> > > v2->v3:
> > >   Call rq_clock_task() inside update_hw_load_avg() to avoid any
> > >   inconsistency in the future. (Qais Yousef)
> > >   Added comments around update_hw_load_avg(). (Qais Yousef)
> > > v1->v2:
> > >   Added Hongyan's Reviewed-by tag.
> > >   Removed the Reported-by/Closes tags because they are not related
> > >   to this fix. (Hongyan Xia)
> > > ---
> > >  kernel/sched/core.c | 2 +-
> > >  kernel/sched/fair.c | 2 +-
> > >  kernel/sched/pelt.c | 5 +++--
> > >  kernel/sched/pelt.h | 6 +++---
> > >  4 files changed, 8 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index a9f655025607..011d447e065f 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -5462,7 +5462,7 @@ void sched_tick(void)
> > >
> > >         update_rq_clock(rq);
> > >         hw_pressure = arch_scale_hw_pressure(cpu_of(rq));
> > > -       update_hw_load_avg(rq_clock_task(rq), rq, hw_pressure);
> > > +       update_hw_load_avg(rq, hw_pressure);
> > >         curr->sched_class->task_tick(rq, curr, 0);
> > >         if (sched_feat(LATENCY_WARN))
> > >                 resched_latency = cpu_resched_latency(rq);
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index 9057584ec06d..193ac2c702d9 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -9362,7 +9362,7 @@ static bool __update_blocked_others(struct rq *rq, bool *done)
> > >
> > >         decayed = update_rt_rq_load_avg(now, rq, curr_class == &rt_sched_class) |
> > >                   update_dl_rq_load_avg(now, rq, curr_class == &dl_sched_class) |
> > > -                 update_hw_load_avg(now, rq, hw_pressure) |
> > > +                 update_hw_load_avg(rq, hw_pressure) |
> > >                   update_irq_load_avg(rq, 0);
> > >
> > >         if (others_have_blocked(rq))
> > > diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
> > > index fa52906a4478..06a5fa85327c 100644
> > > --- a/kernel/sched/pelt.c
> > > +++ b/kernel/sched/pelt.c
> > > @@ -400,9 +400,10 @@ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
> > >   *                     capped capacity a cpu due to a HW event.
> > >   */
> > >
> > > -int update_hw_load_avg(u64 now, struct rq *rq, u64 capacity)
> > > +int update_hw_load_avg(struct rq *rq, u64 capacity)
> > >  {
> > > -       if (___update_load_sum(now, &rq->avg_hw,
> > > +       /* hw_pressure doesn't care about invariance */
> > > +       if (___update_load_sum(rq_clock_task(rq), &rq->avg_hw,
> > >                                capacity,
> > >                                capacity,
> > >                                capacity)) {
> > > diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
> > > index 2150062949d4..261172c29f8f 100644
> > > --- a/kernel/sched/pelt.h
> > > +++ b/kernel/sched/pelt.h
> > > @@ -8,7 +8,7 @@ int update_rt_rq_load_avg(u64 now, struct rq *rq, int running);
> > >  int update_dl_rq_load_avg(u64 now, struct rq *rq, int running);
> > >
> > >  #ifdef CONFIG_SCHED_HW_PRESSURE
> > > -int update_hw_load_avg(u64 now, struct rq *rq, u64 capacity);
> > > +int update_hw_load_avg(struct rq *rq, u64 capacity);
> > >
> > >  static inline u64 hw_load_avg(struct rq *rq)
> > >  {
> > > @@ -16,7 +16,7 @@ static inline u64 hw_load_avg(struct rq *rq)
> > >  }
> > >  #else
> > >  static inline int
> > > -update_hw_load_avg(u64 now, struct rq *rq, u64 capacity)
> > > +update_hw_load_avg(struct rq *rq, u64 capacity)
> > >  {
> > >         return 0;
> > >  }
> > > @@ -202,7 +202,7 @@ update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
> > >  }
> > >
> > >  static inline int
> > > -update_hw_load_avg(u64 now, struct rq *rq, u64 capacity)
> > > +update_hw_load_avg(struct rq *rq, u64 capacity)
> > >  {
> > >         return 0;
> > >  }
> > > --
> > > 2.25.1
> > >
