Message-ID: <20161017090903.GA11962@linaro.org>
Date:   Mon, 17 Oct 2016 11:09:03 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Joseph Salisbury <joseph.salisbury@...onical.com>
Cc:     Dietmar Eggemann <dietmar.eggemann@....com>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        Mike Galbraith <efault@....de>, omer.akram@...onical.com
Subject: Re: [v4.8-rc1 Regression] sched/fair: Apply more PELT fixes

On Friday 14 Oct 2016 at 12:04:02 (-0400), Joseph Salisbury wrote:
> On 10/14/2016 11:18 AM, Vincent Guittot wrote:
> > On Friday 14 Oct 2016 at 14:10:07 (+0100), Dietmar Eggemann wrote:
> >> On 14/10/16 09:24, Vincent Guittot wrote:
> >>> On 13 October 2016 at 23:34, Vincent Guittot <vincent.guittot@...aro.org> wrote:
> >>>> On 13 October 2016 at 20:49, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> >>>>> On 13/10/16 17:48, Vincent Guittot wrote:
> >>>>>> On 13 October 2016 at 17:52, Joseph Salisbury
> >>>>>> <joseph.salisbury@...onical.com> wrote:
> >>>>>>> On 10/13/2016 06:58 AM, Vincent Guittot wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> On 12 October 2016 at 18:21, Joseph Salisbury
> >>>>>>>> <joseph.salisbury@...onical.com> wrote:
> >>>>>>>>> On 10/12/2016 08:20 AM, Vincent Guittot wrote:
> >>>>>>>>>> On 8 October 2016 at 13:49, Mike Galbraith <efault@....de> wrote:
> >>>>>>>>>>> On Sat, 2016-10-08 at 13:37 +0200, Vincent Guittot wrote:
> >>>>>>>>>>>> On 8 October 2016 at 10:39, Ingo Molnar <mingo@...nel.org> wrote:
> >>>>>>>>>>>>> * Peter Zijlstra <peterz@...radead.org> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Fri, Oct 07, 2016 at 03:38:23PM -0400, Joseph Salisbury wrote:
> >> [...]
> >>
> >>>>> When I create a tg_root/tg_x/tg_y_1 and a tg_root/tg_x/tg_y_2 group, the tg_x->load_avg
> >>>>> becomes > 6*1024 before any task has run in it.
> >>>> This is normal, as se->avg.load_avg is initialized to
> >>>> scale_load_down(se->load.weight), and this se->avg.load_avg will be
> >>>> added to tg_x[cpu]->cfs_rq->avg.load_avg when the entity is attached to the cfs_rq.
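
(For reference, the mechanism is roughly the following -- a simplified sketch
based on init_entity_runnable_average() and attach_entity_load_avg(), not the
exact kernel code:)

/* Simplified sketch, not the exact kernel code. */
void init_entity_runnable_average(struct sched_entity *se)
{
	/*
	 * A newly created (group) entity starts with its full weight as
	 * load_avg: scale_load_down(NICE_0_LOAD) == 1024 on 64-bit.
	 */
	se->avg.load_avg = scale_load_down(se->load.weight);
}

void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
	/*
	 * That initial contribution is added to the parent cfs_rq as soon
	 * as the entity is attached, i.e. before any task has run in the
	 * new group.
	 */
	cfs_rq->avg.load_avg += se->avg.load_avg;
}
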
> >> Yeah, you're right; even when I've created 50 second-level groups,
> >> tg_x->load_avg is ~6800.
> >>
> >> Could it have something to do with the fact that .se->load.weight = 2
> >> for all these task groups, on a 64-bit system?
> > I don't think so. The problem really comes from tg->load_avg = 381697,
> > while the sum of cfs_rq[cpu]->tg_load_avg_contrib is 1013, which is << tg->load_avg.
> > Since cfs_rq[cpu]->tg_load_avg_contrib == cfs_rq[cpu]->avg.load_avg, we can't
> > expect any negative delta to remove this large value.
> >
> >> When we call __update_load_avg(..., se->on_rq *
> >> scale_load_down(se->load.weight), ...), we pass a weight argument of 0
> >> for these se's.
> >>
> >> Does not happen with:
> >>
> >> -       if (shares < MIN_SHARES)
> >> -               shares = MIN_SHARES;
> >> +       if (shares < scale_load(MIN_SHARES))
> >> +               shares = scale_load(MIN_SHARES);
> >>
> >> in  calc_cfs_shares().
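
(Side note on the arithmetic, assuming a 64-bit build where scale_load() /
scale_load_down() shift by 10: clamping shares to the unscaled MIN_SHARES
leaves se->load.weight == 2, which scale_load_down() turns into 0 -- hence
the zero weight passed to __update_load_avg() above. A trivial user-space
illustration, shift and values assumed:)

#include <stdio.h>

#define SHIFT			10	/* assumed 64-bit fixed-point shift */
#define MIN_SHARES		2UL
#define scale_load(w)		((w) << SHIFT)
#define scale_load_down(w)	((w) >> SHIFT)

int main(void)
{
	/* unscaled clamp: a weight of 2 vanishes when scaled back down */
	printf("%lu\n", scale_load_down(MIN_SHARES));			/* 0 */
	/* scaled clamp: the weight survives the round trip */
	printf("%lu\n", scale_load_down(scale_load(MIN_SHARES)));	/* 2 */
	return 0;
}
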
> >>
> >> [...]
> >>
> >>
> Adding Omer to CC list, as he is able to reproduce this bug.
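
My suspicion is that cfs_rq->avg.load_avg can change between the two reads of
it in update_tg_load_avg(): the delta added to tg->load_avg is then computed
from one value while tg_load_avg_contrib records the other, and that
difference can never be pulled back out by a later update. A trivial
user-space mock-up of the effect (not kernel code, values are made up):

#include <stdio.h>

static long tg_load_avg;		/* atomic_long_t in the kernel      */
static long tg_load_avg_contrib;	/* per-cpu contribution bookkeeping */

/*
 * Mimics the current update_tg_load_avg() when cfs_rq->avg.load_avg is
 * read twice and changes in between (first_read != second_read).
 */
static void update_racy(long first_read, long second_read)
{
	long delta = first_read - tg_load_avg_contrib;

	tg_load_avg += delta;
	tg_load_avg_contrib = second_read;
}

int main(void)
{
	/* load_avg drops from 1024 to 0 between the two reads */
	update_racy(1024, 0);
	/* from now on load_avg stays at 0, so delta is always 0 */
	update_racy(0, 0);

	printf("tg->load_avg = %ld, contrib = %ld\n",
	       tg_load_avg, tg_load_avg_contrib);	/* 1024 vs 0 */
	return 0;
}

The patch below just makes sure that the delta and tg_load_avg_contrib both
use the same value of cfs_rq->avg.load_avg.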

Could you try the patch below on top of the faulty kernel?

---
 kernel/sched/fair.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8b03fb5..8926685 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2902,7 +2902,8 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
  */
 static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force)
 {
-	long delta = cfs_rq->avg.load_avg - cfs_rq->tg_load_avg_contrib;
+	unsigned long load_avg = READ_ONCE(cfs_rq->avg.load_avg);
+	long delta = load_avg - cfs_rq->tg_load_avg_contrib;
 
 	/*
 	 * No need to update load_avg for root_task_group as it is not used.
@@ -2912,7 +2913,7 @@ static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force)
 
 	if (force || abs(delta) > cfs_rq->tg_load_avg_contrib / 64) {
 		atomic_long_add(delta, &cfs_rq->tg->load_avg);
-		cfs_rq->tg_load_avg_contrib = cfs_rq->avg.load_avg;
+		cfs_rq->tg_load_avg_contrib = load_avg;
 	}
 }
 
-- 
2.7.4


> 
