Date:   Mon, 25 Jan 2021 21:35:56 +0000
From:   "Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>
To:     Valentin Schneider <valentin.schneider@....com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC:     "mingo@...nel.org" <mingo@...nel.org>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
        "dietmar.eggemann@....com" <dietmar.eggemann@....com>,
        "morten.rasmussen@....com" <morten.rasmussen@....com>,
        "mgorman@...e.de" <mgorman@...e.de>
Subject: RE: [PATCH 1/1] sched/topology: Make sched_init_numa() use a set for
 the deduplicating sort



> -----Original Message-----
> From: Valentin Schneider [mailto:valentin.schneider@....com]
> Sent: Tuesday, January 26, 2021 5:46 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@...ilicon.com>;
> linux-kernel@...r.kernel.org
> Cc: mingo@...nel.org; peterz@...radead.org; vincent.guittot@...aro.org;
> dietmar.eggemann@....com; morten.rasmussen@....com; mgorman@...e.de
> Subject: RE: [PATCH 1/1] sched/topology: Make sched_init_numa() use a set for
> the deduplicating sort
> 
> On 25/01/21 09:26, Valentin Schneider wrote:
> > On 25/01/21 02:23, Song Bao Hua (Barry Song) wrote:
> >
> >> with the below topology:
> >> qemu-system-aarch64 -M virt -nographic \
> >>  -smp cpus=8 \
> >>  -numa node,cpus=0-1,nodeid=0 \
> >>  -numa node,cpus=2-3,nodeid=1 \
> >>  -numa node,cpus=4-5,nodeid=2 \
> >>  -numa node,cpus=6-7,nodeid=3 \
> >>  -numa dist,src=0,dst=1,val=12 \
> >>  -numa dist,src=0,dst=2,val=20 \
> >>  -numa dist,src=0,dst=3,val=22 \
> >>  -numa dist,src=1,dst=2,val=22 \
> >>  -numa dist,src=2,dst=3,val=12 \
> >>  -numa dist,src=1,dst=3,val=24 \
> >>
> >>
> >> The panic address is *1294:
> >>
> >>                         if (sdd->sd) {
> >>     1280:       f9400e61        ldr     x1, [x19, #24]
> >>     1284:       b4000201        cbz     x1, 12c4 <build_sched_domains+0x414>
> >>                                 sd = *per_cpu_ptr(sdd->sd, j);
> >>     1288:       93407eb7        sxtw    x23, w21
> >>     128c:       aa0103e0        mov     x0, x1
> >>     1290:       f8777ac2        ldr     x2, [x22, x23, lsl #3]
> >>     *1294:       f8626800        ldr     x0, [x0, x2]
> >>                                 if (sd && (sd->flags & SD_OVERLAP))
> >>     1298:       b4000120        cbz     x0, 12bc <build_sched_domains+0x40c>
> >>     129c:       b9403803        ldr     w3, [x0, #56]
> >>     12a0:       365800e3        tbz     w3, #11, 12bc
> >> <build_sched_domains+0x40c>
> >>                                         free_sched_groups(sd->groups, 0);
> >>     12a4:       f9400800        ldr     x0, [x0, #16]
> >>         if (!sg)
> >>
> >
> > Thanks for giving it a shot, let me run that with your topology and see
> > where I end up.
> >
> 
> I can't seem to reproduce this - your topology is actually among the ones
> I tested this against.
> 
> Applying this patch obviously doesn't get rid of the group span issue, but
> it does remove this warning:
> 
> [    0.354806] ERROR: Node-0 not representative
> [    0.354806]
> [    0.355223]   10 12 20 22
> [    0.355475]   12 10 22 24
> [    0.355648]   20 22 10 12
> [    0.355814]   22 24 12 10
> 
> I'm running this based on tip/sched/core:
> 
>   65bcf072e20e ("sched: Use task_current() instead of 'rq->curr == p'")

I was using 5.11-rc1. One thing I'd like to mention is that:

For the below topology:
+-------+          +-----+
| node1 |  20      |node2|
|       +----------+     |
+---+---+          +-----+
    |                  |12
12  |                  |
+---+---+          +---+-+
|       |          |node3|
| node0 |          |     |
+-------+          +-----+

with node0-node2 as 22, node0-node3 as 24, node1-node3 as 22.

I will get the below sched_domains_numa_distance[]:
10, 12, 22, 24
As you can see, there is *no* 20, so node1 and node2 will
only get a two-level NUMA sched_domain.


But for the below topology:
+-------+          +-----+
| node0 |  20      |node2|
|       +----------+     |
+---+---+          +-----+
    |                  |12
12  |                  |
+---+---+          +---+-+
|       |          |node3|
| node1 |          |     |
+-------+          +-----+

with node1-node2 as 22, node1-node3 as 24, node0-node3 as 22.

I will get the below sched_domains_numa_distance[]:
10, 12, 20, 22, 24

What I have seen is that performance is better if we drop
the 20, as we get a sched_domain hierarchy with fewer
levels, and the two intermediate nodes won't have the group
span issue.

Thanks
Barry
