Message-ID: <CAKfTPtCSGcx0_0b7PTWPNi9LnGqCCTpt4zswOBumVgr7CWAJbQ@mail.gmail.com>
Date:   Thu, 18 Jun 2020 14:35:36 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Xing Zhengjun <zhengjun.xing@...ux.intel.com>
Cc:     Hillf Danton <hdanton@...a.com>,
        kernel test robot <rong.a.chen@...el.com>,
        Ingo Molnar <mingo@...nel.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Juri Lelli <juri.lelli@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Valentin Schneider <valentin.schneider@....com>,
        Phil Auld <pauld@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [LKP] [sched/fair] 070f5e860e: reaim.jobs_per_min -10.5% regression

On Thu, 18 Jun 2020 at 04:45, Xing Zhengjun
<zhengjun.xing@...ux.intel.com> wrote:
>
>
>

> >>
> >> This bench forks a new thread for each and every new step. But a newly forked
> >> thread starts with its load_avg and runnable_avg set to max, whereas the threads
> >> run only briefly before exiting. This causes the CPU to be classified as
> >> overloaded in some cases when it isn't.
> >>
> >> Could you try the patch below?
> >> It fixes the problem on my setup (I have finally been able to reproduce the problem).
> >>
> >> ---
> >>   kernel/sched/fair.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> index 0aeffff62807..b33a4a9e1491 100644
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -807,7 +807,7 @@ void post_init_entity_util_avg(struct task_struct *p)
> >>              }
> >>      }
> >>
> >> -    sa->runnable_avg = cpu_scale;
> >> +    sa->runnable_avg = sa->util_avg;
> >>
> >>      if (p->sched_class != &fair_sched_class) {
> >>              /*
> >> --
> >> 2.17.1
> >>
> >
> > The patch above tries to move the group back into the same classification as
> > before, but this could harm other benchmarks.
> >
> > There is another way to fix this: easing the migration of tasks in the case
> > of a migrate_util imbalance.
> >
> > Could you also try the patch below instead of the one above?
> >
> > ---
> >   kernel/sched/fair.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 0aeffff62807..fcaf66c4d086 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7753,7 +7753,8 @@ static int detach_tasks(struct lb_env *env)
> >               case migrate_util:
> >                       util = task_util_est(p);
> >
> > -                     if (util > env->imbalance)
> > +                     if (util/2 > env->imbalance &&
> > +                         env->sd->nr_balance_failed <= env->sd->cache_nice_tries)
> >                               goto next;
> >
> >                       env->imbalance -= util;
> > --
> > 2.17.1
> >
> >
>
> I applied the patch on top of v5.7; the test result is as follows:
>
> =========================================================================================
> tbox_group/testcase/rootfs/kconfig/compiler/runtime/nr_task/debug-setup/test/cpufreq_governor/ucode:
>
> lkp-ivb-d04/reaim/debian-x86_64-20191114.cgz/x86_64-rhel-7.6/gcc-7/300s/100%/test/five_sec/performance/0x21
>
> commit:
>    9f68395333ad7f5bfe2f83473fed363d4229f11c
>    070f5e860ee2bf588c99ef7b4c202451faa48236
>    v5.7
>    69c81543653bf5f2c7105086502889fa019c15cb  (the test patch)
>
> 9f68395333ad7f5b 070f5e860ee2bf588c99ef7b4c2                        v5.7 69c81543653bf5f2c7105086502
> ---------------- --------------------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \          |                \
>        0.69           -10.3%       0.62            -9.1%       0.62            -7.6%       0.63        reaim.child_systime
>        0.62            -1.0%       0.61            +0.5%       0.62            +1.9%       0.63        reaim.child_utime
>       66870           -10.0%      60187            -7.6%      61787            -5.9%      62947        reaim.jobs_per_min

There is an improvement, but not at the same level as on my setup.
I'm not sure which patch you tested here. Is it the last one, which
modifies detach_tasks(), or the previous one, which modifies
post_init_entity_util_avg()?

Could you also try the other one? Both patches improved the results
on my setup, but the behavior doesn't seem to be the same on yours.


>       16717           -10.0%      15046            -7.6%      15446            -5.9%      15736        reaim.jobs_per_min_child
>       97.84            -1.1%      96.75            -0.4%      97.43            -0.4%      97.47        reaim.jti
>       72000           -10.8%      64216            -8.3%      66000            -5.7%      67885        reaim.max_jobs_per_min
>        0.36           +10.6%       0.40            +7.8%       0.39            +6.0%       0.38        reaim.parent_time
>        1.58 ±  2%     +71.0%       2.70 ±  2%     +26.9%       2.01 ±  2%     +23.6%       1.95 ±  3%  reaim.std_dev_percent
>        0.00 ±  5%    +110.4%       0.01 ±  3%     +48.8%       0.01 ±  7%     +43.2%       0.01 ±  5%  reaim.std_dev_time
>       50800            -2.4%      49600            -1.6%      50000            -0.8%      50400        reaim.workload
>
>
> >>
> >>
...
>
> --
> Zhengjun Xing
