[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <X/xL8d8FXVJHkQUj@hirez.programming.kicks-ass.net>
Date: Mon, 11 Jan 2021 14:00:33 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Steve Wahl <steve.wahl@....com>
Cc: rja_direct@...ups.int.hpe.com, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org,
"Liang, Kan" <kan.liang@...el.com>
Subject: Re: [PATCH 2/2] perf/x86/intel/uncore: With > 8 nodes, get pci bus
die id from NUMA info
On Fri, Jan 08, 2021 at 09:35:49AM -0600, Steve Wahl wrote:
> + /*
> + * The nodeid and idmap registers only contain enough
> + * information to handle 8 nodes. On systems with more
> + * than 8 nodes, we need to rely on NUMA information,
> + * filled in from BIOS supplied information, to determine
> + * the topology.
> + */
Egads.. do we realy have to trust BIOS data? BIOS crud tends to be
bonghits qualitee :/
> + if (nr_node_ids <= 8) {
> + /* get the Node ID of the local register */
> + err = pci_read_config_dword(ubox_dev, nodeid_loc, &config);
> + if (err)
> + break;
> + nodeid = config & NODE_ID_MASK;
> + /* get the Node ID mapping */
> + err = pci_read_config_dword(ubox_dev, idmap_loc, &config);
> + if (err)
> + break;
>
> + segment = pci_domain_nr(ubox_dev->bus);
> + raw_spin_lock(&pci2phy_map_lock);
> + map = __find_pci2phy_map(segment);
> + if (!map) {
> + raw_spin_unlock(&pci2phy_map_lock);
> + err = -ENOMEM;
> + break;
> + }
> +
> + /*
> + * every three bits in the Node ID mapping register maps
> + * to a particular node.
> + */
> + for (i = 0; i < 8; i++) {
> + if (nodeid == ((config >> (3 * i)) & 0x7)) {
> + if (topology_max_die_per_package() > 1)
> + die_id = i;
> + else
> + die_id = topology_phys_to_logical_pkg(i);
> + map->pbus_to_dieid[bus] = die_id;
> + break;
> + }
> + }
> raw_spin_unlock(&pci2phy_map_lock);
> + } else {
> + int node = pcibus_to_node(ubox_dev->bus);
> + int cpu;
> +
> + segment = pci_domain_nr(ubox_dev->bus);
> + raw_spin_lock(&pci2phy_map_lock);
> + map = __find_pci2phy_map(segment);
> + if (!map) {
> + raw_spin_unlock(&pci2phy_map_lock);
> + err = -ENOMEM;
> + break;
> + }
> + die_id = -1;
> + for_each_cpu(cpu, cpumask_of_pcibus(ubox_dev->bus)) {
> + struct cpuinfo_x86 *c = &cpu_data(cpu);
> +
> + if (c->initialized && cpu_to_node(cpu) == node) {
> + map->pbus_to_dieid[bus] = die_id = c->logical_die_id;
> + break;
> + }
> + }
> + raw_spin_unlock(&pci2phy_map_lock);
> +
> + if (WARN_ON_ONCE(die_id == -1)) {
> + err = -EINVAL;
> break;
> }
This seems to assume a single die per node; is that fundemantally
correct?
Did you consider malicious BIOS data? I think we're good, but I didn't
look too hard.
> }
> }
>
> if (!err) {
> --
> 2.26.2
>
Powered by blists - more mailing lists