Message-ID: <20200907104406.GD3179@techsingularity.net>
Date:   Mon, 7 Sep 2020 11:44:06 +0100
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     "Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>
Cc:     Mel Gorman <mgorman@...e.de>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "juri.lelli@...hat.com" <juri.lelli@...hat.com>,
        "vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
        "dietmar.eggemann@....com" <dietmar.eggemann@....com>,
        "bsegall@...gle.com" <bsegall@...gle.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Linuxarm <linuxarm@...wei.com>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Valentin Schneider <valentin.schneider@....com>,
        Phil Auld <pauld@...hat.com>, Hillf Danton <hdanton@...a.com>,
        Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH] sched/fair: use dst group while checking imbalance for
 NUMA balancer

On Mon, Sep 07, 2020 at 09:44:03AM +0000, Song Bao Hua (Barry Song) wrote:
> 
> 
> > -----Original Message-----
> > From: Mel Gorman [mailto:mgorman@...e.de]
> > Sent: Monday, September 7, 2020 9:27 PM
> > To: Song Bao Hua (Barry Song) <song.bao.hua@...ilicon.com>
> > Cc: mingo@...hat.com; peterz@...radead.org; juri.lelli@...hat.com;
> > vincent.guittot@...aro.org; dietmar.eggemann@....com;
> > bsegall@...gle.com; linux-kernel@...r.kernel.org; Linuxarm
> > <linuxarm@...wei.com>; Mel Gorman <mgorman@...hsingularity.net>;
> > Peter Zijlstra <a.p.zijlstra@...llo.nl>; Valentin Schneider
> > <valentin.schneider@....com>; Phil Auld <pauld@...hat.com>; Hillf Danton
> > <hdanton@...a.com>; Ingo Molnar <mingo@...nel.org>
> > Subject: Re: [PATCH] sched/fair: use dst group while checking imbalance for
> > NUMA balancer
> > 
> > On Mon, Sep 07, 2020 at 07:27:08PM +1200, Barry Song wrote:
> > > Something is wrong. In find_busiest_group(), we are checking if src has
> > > higher load, however, in task_numa_find_cpu(), we are checking if dst
> > > will have higher load after balancing. It seems it is not sensible to
> > > check src.
> > > > It may cause a wrong imbalance value. For example, if
> > > > dst_running = env->dst_stats.nr_running + 1 results in 3 or above, and
> > > > src_running = env->src_stats.nr_running - 1 results in 1,
> > > > the current code treats the imbalance as 0 because src_running is
> > > > smaller than 2.
> > > > This is inconsistent with the load balancer.
> > >
> > 
> > It checks the conditions if the move was to happen. Have you evaluated
> > this for a NUMA balancing load and confirmed it a) balances properly and
> > b) does not increase the scan rate trying to "fix" the problem?
> 
> I think the original code was trying to check whether the NUMA migration
> would lead to a new imbalance in the load balancer. Say src is A, dst is B,
> and both of them have nr_running of 2. A moves one task to B, so A
> will have 1 and B will have 3. In the load balancer, A will try to pull a
> task from B since B's nr_running is larger than min_imbalance. But the code
> is saying imbalance=0 because it finds A's nr_running is smaller than
> min_imbalance.
> 
> Will share more test data if you need.
> 

Include the test description, data and details of the system you used to
evaluate the patch. I ask because the load/NUMA reconciliation took a long
time to cover all the corner cases and it's very easy to reintroduce major
regressions. At least one of those corner cases was trying to balance
in the wrong direction because in some cases NUMA balancing will try to
allow a small imbalance if it makes sense from a locality point of view.
Another corner case was that if the small imbalance is too large or done at
the wrong time, it regresses overall even though the locality is good
because of memory bandwidth limitations. This is obviously far from ideal
but it does mean that it's an area that needs data backing up the changes.
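
For reference, the A/B example quoted above can be sketched in user-space
C. This is a simplified model and not the kernel code: the function name
imbalance_after_move(), the check_dst flag and the IMBALANCE_MIN value of 2
are assumptions chosen to match the numbers used in the thread.

```c
#include <assert.h>

/* Assumed threshold: a small imbalance of this size is tolerated. */
#define IMBALANCE_MIN 2

/*
 * Hypothetical model of the check under discussion. It computes the
 * imbalance that would exist if one task moved from src to dst, then
 * ignores it when the checked group is at or below the threshold.
 * check_dst == 0 tests the source group, check_dst != 0 tests the
 * destination group as the patch proposes.
 */
static int imbalance_after_move(int src_nr_running, int dst_nr_running,
				int check_dst)
{
	int src_running, dst_running, imbalance, checked;

	assert(src_nr_running >= 1);	/* there must be a task to move */

	src_running = src_nr_running - 1;
	dst_running = dst_nr_running + 1;

	/* Only an excess on the destination side counts as imbalance. */
	imbalance = dst_running - src_running;
	if (imbalance < 0)
		imbalance = 0;

	checked = check_dst ? dst_running : src_running;
	if (checked <= IMBALANCE_MIN)
		return 0;

	return imbalance;
}
```

With both groups at 2 tasks, imbalance_after_move(2, 2, 0) returns 0 while
imbalance_after_move(2, 2, 1) returns 2, which is the discrepancy being
argued about. The model says nothing about which behaviour performs better
on a real NUMA workload; that still needs the data requested above.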

-- 
Mel Gorman
SUSE Labs
