Message-ID: <eb289ebb-1d9f-a5e9-6c03-4ddd3d343547@oracle.com>
Date: Tue, 20 Sep 2022 11:25:00 -0700
From: Kin Cho <kin.cho@...cle.com>
To: kan.liang@...ux.intel.com, peterz@...radead.org, mingo@...nel.org,
acme@...hat.com, linux-kernel@...r.kernel.org
Cc: alexander.shishkin@...ux.intel.com, jolsa@...hat.com,
eranian@...gle.com, namhyung@...nel.org, ak@...ux.intel.com
Subject: Re: [PATCH V2 0/5] Uncore PMON discovery mechanism support
Hi Kan,
We're seeing the warning below from uncore_insert_box_info on SPR.
I added a debug print:
	/* Parsing Unit Discovery State */
	for (i = 0; i < global.max_units; i++) {
		..
		uncore_insert_box_info(&unit, die, *parsed);
>>		pr_info("%d 0x%llx\n", i, unit.ctl);
and here's the output:
[ 17.758579] intel_uncore: 0 0x2fc0
[ 17.763117] intel_uncore: 2 0x2010
..
[ 17.935286] intel_uncore: 65 0x87e410a0
[ 17.940308] intel_uncore: 66 0x87e21318
[ 17.945331] ------------[ cut here ]------------
[ 17.946305] WARNING: CPU: 65 PID: 1 at
arch/x86/events/intel/uncore_discovery.c:184
intel_uncore_has_discovery_tables+0x4c0/0x65c
..
[ 18.161512] intel_uncore: 67 0x87e410a0
[ 18.166533] intel_uncore: 68 0x87e21318
..
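Looking only at the lines quoted above, units 65 and 67 print the same ctl (0x87e410a0), as do units 66 and 68 (0x87e21318), and the warning is logged just before the line for unit 67. Since the pr_info comes after the call, the warning may have fired while unit 67 was being inserted. A minimal sketch in plain userspace C (not kernel code; struct unit_log, first_duplicate_ctl and shown[] are hypothetical names used for illustration) that flags the first repeated ctl among the shown lines:

```c
#include <stddef.h>
#include <stdint.h>

/* One line of the debug output: unit index and its control address. */
struct unit_log {
	int idx;
	uint64_t ctl;
};

/*
 * Return the position in log[] of the first entry whose ctl matches an
 * earlier entry, or -1 if every ctl is unique. If earlier is non-NULL,
 * it receives the position of the matching earlier entry.
 */
int first_duplicate_ctl(const struct unit_log *log, size_t n, size_t *earlier)
{
	for (size_t i = 1; i < n; i++) {
		for (size_t j = 0; j < i; j++) {
			if (log[i].ctl == log[j].ctl) {
				if (earlier)
					*earlier = j;
				return (int)i;
			}
		}
	}
	return -1;
}

/* Only the lines quoted in this message; the elided ones are not guessed. */
const struct unit_log shown[] = {
	{ 0,  0x2fc0 },
	{ 2,  0x2010 },
	{ 65, 0x87e410a0 },
	{ 66, 0x87e21318 },
	{ 67, 0x87e410a0 },
	{ 68, 0x87e21318 },
};
```

Called as first_duplicate_ctl(shown, 6, &earlier), it returns position 4 (unit 67) with earlier set to position 2 (unit 65). The elided lines are not represented, so this only shows that the quoted output contains a repeat, not that the repeat is the cause of the warning.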
Any suggestions?
-kin
[ 17.945331] ------------[ cut here ]------------
[ 17.946305] WARNING: CPU: 65 PID: 1 at
arch/x86/events/intel/uncore_discovery.c:184
intel_uncore_has_discovery_tables+0x4c0/0x65c
[ 17.946305] Modules linked in:
[ 17.946305] CPU: 65 PID: 1 Comm: swapper/0 Not tainted
5.4.17-2136.313.1-X10-2c+ #4
[ 17.946305] Hardware name: Oracle Corporation
sca-x102c-107-sp/PCA,MB,X10-2c, BIOS 79805101 09/13/2022
[ 17.946305] RIP: 0010:intel_uncore_has_discovery_tables+0x4c0/0x65c
[ 17.946305] Code: 38 48 63 f0 48 8d 3c b1 45 8b 04 b0 44 89 07 4c 8b
42 40 45 8b 04 b0 45 89 04 b1 0f b7 75 ca 3b 37 75 cf 4c 89 8d 68 ff ff
ff <0f> 0b 48 89 cf e8 c6 4f 2b 00 4c 8b 8d 68 ff ff ff 4c 89 cf e8 b7
[ 17.946305] RSP: 0000:ff4b04f60006bd08 EFLAGS: 00010246
[ 17.946305] RAX: 0000000000000002 RBX: 0000000000000044 RCX:
ff43a98a4ff1bb30
[ 17.946305] RDX: ff43a98a4ff294e0 RSI: 0000000000000003 RDI:
ff43a98a4ff1bb38
[ 17.946305] RBP: ff4b04f60006bdb0 R08: 0000000000018000 R09:
ff43a98a4ff1b310
[ 17.946305] R10: 0000000000000005 R11: ff43a98c7f7fe000 R12:
ff43a98777d66000
[ 17.946305] R13: 0000000000015240 R14: ff4b04f61b286000 R15:
0000000000000043
[ 17.946305] FS: 0000000000000000(0000) GS:ff43a90a7f840000(0000)
knlGS:0000000000000000
[ 17.946305] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 17.946305] CR2: 0000000000000000 CR3: 000000967e00a001 CR4:
0000000000761ee0
[ 17.946305] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 17.946305] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7:
0000000000000400
[ 17.946305] PKRU: 55555554
[ 17.946305] Call Trace:
[ 17.946305] ? uncore_types_init+0x25f/0x25f
[ 17.946305] intel_uncore_init+0x64/0x50c
[ 17.946305] ? perf_pmu_register+0x2cc/0x403
[ 17.946305] ? uncore_types_init+0x25f/0x25f
[ 17.946305] do_one_initcall+0x52/0x1e1
[ 17.946305] ? trace_event_define_fields_initcall_level+0x2a/0x36
[ 17.946305] kernel_init_freeable+0x1fc/0x2a7
[ 17.946305] ? loglevel+0x5d/0x5d
[ 17.946305] ? rest_init+0xb0/0xb0
[ 17.946305] kernel_init+0xe/0x123
[ 17.946305] ret_from_fork+0x24/0x36
[ 17.946305] ---[ end trace d9131e47b8a615f4 ]---
On 3/17/21 10:59 AM, kan.liang@...ux.intel.com wrote:
> From: Kan Liang <kan.liang@...ux.intel.com>
>
> Changes since V1:
> - Use the generic rbtree functions, rb_add() and rb_find(). (Patch 1)
> - Add a module parameter, uncore_no_discover. If users don't want the
> discovery feature, they can set uncore_no_discover=true. (Patch 1)
>
>
> A mechanism of self-describing HW for the uncore PMON has been
> introduced with the latest Intel platforms. By reading through an MMIO
> page worth of information, SW can ‘discover’ all the standard uncore
> PMON registers.
>
> With the discovery mechanism, Perf can
> - Retrieve the generic uncore unit information of all standard uncore
> blocks, e.g., the address of counters, the address of the counter
> control, the counter width, the access type, etc.
> Perf can provide basic uncore support based on this information.
> For a new platform, perf users will get basic uncore support even if
> the platform-specific enabling code is not ready yet.
> - Retrieve accurate uncore unit information, e.g., the number of uncore
> boxes. The number of uncore boxes may be different among machines.
> Currently, perf hard-codes the max number of uncore blocks. On some
> machines, perf may create a PMU for an unavailable uncore block.
> Although there is no harm (it always returns 0 for the unavailable
> uncore block), it may confuse users. The discovery mechanism can provide
> the accurate number of available uncore boxes on a machine.
>
> However, the discovery mechanism has some limitations:
> - It relies on BIOS support. If the BIOS doesn't support the discovery
> mechanism, the uncore driver will exit with -ENODEV. In that case,
> nothing changes.
> - It only provides the generic uncore unit information. The information
> for advanced features, such as fixed counters, filters, and
> constraints, cannot be retrieved.
> - It only supports the standard PMON blocks. Non-standard PMON blocks,
> e.g., free-running counters, are not supported.
> - It only provides an ID for an uncore block. No meaningful name is
> provided. The uncore_type_&typeID_&boxID will be used as the name.
> - Enabling the PCI and MMIO types of uncore blocks relies on NUMA support.
> These uncore blocks require the mapping information from a bus to a
> die. The current discovery table doesn't provide the mapping
> information. The pcibus_to_node() from NUMA is used to retrieve the
> information. If NUMA is not supported, some uncore blocks may be
> unavailable.
>
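The uncore_type_&typeID_&boxID naming mentioned in the list above could be produced along these lines. This is only an illustration: uncore_generic_name is a hypothetical helper, not a function from the patch set, and the exact format specifiers the driver uses are an assumption.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build a generic PMU name from the numeric type ID and box ID.
 * Returns what snprintf() returns: the length the full name needs,
 * which is >= len if the name was truncated.
 */
int uncore_generic_name(char *buf, size_t len,
			unsigned int type_id, unsigned int box_id)
{
	return snprintf(buf, len, "uncore_type_%u_%u", type_id, box_id);
}
```

For example, type ID 3 and box ID 1 would give "uncore_type_3_1".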
> To locate the MMIO page, SW has to find a PCI device with the unique
> capability ID 0x23 and retrieve its BAR address.
>
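As a rough illustration of the lookup step described above, here is a userspace sketch that walks the capability list in a snapshot of a device's config space. It assumes 0x23 is an ID in the PCIe extended capability list (which starts at offset 0x100, with a 16-bit ID in bits 15:0 and the next offset in bits 31:20 of each header). cfg_read32 and find_ext_cap are hypothetical helpers, and retrieving the BAR address after the capability is found is not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define CFG_SPACE_SIZE		4096	/* PCIe extended config space size */
#define EXT_CAP_START		0x100	/* first extended capability header */
#define DISCOVERY_CAP_ID	0x23	/* ID quoted in the cover letter */

/* Read a little-endian 32-bit value from a config-space snapshot. */
uint32_t cfg_read32(const uint8_t *cfg, size_t off)
{
	return (uint32_t)cfg[off] |
	       (uint32_t)cfg[off + 1] << 8 |
	       (uint32_t)cfg[off + 2] << 16 |
	       (uint32_t)cfg[off + 3] << 24;
}

/*
 * Walk the extended capability list in a 4 KiB snapshot and return the
 * offset of the first capability with the given ID, or 0 if not found.
 */
size_t find_ext_cap(const uint8_t *cfg, uint16_t id)
{
	size_t off = EXT_CAP_START;
	/* Bound the walk so a malformed, looping list cannot hang us. */
	int ttl = (CFG_SPACE_SIZE - EXT_CAP_START) / 8;

	while (off >= EXT_CAP_START && off + 4 <= CFG_SPACE_SIZE && ttl-- > 0) {
		uint32_t hdr = cfg_read32(cfg, off);

		/* An all-zero or all-ones header means no (more) capabilities. */
		if (hdr == 0 || hdr == 0xffffffffu)
			return 0;
		if ((hdr & 0xffff) == id)
			return off;
		off = (hdr >> 20) & 0xffc;	/* next capability offset */
	}
	return 0;
}
```

Given a 4096-byte buffer read from a device's config space, find_ext_cap(cfg, DISCOVERY_CAP_ID) returns the offset of the matching capability header, or 0 when the device does not have one.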
> The spec can be found in the Snow Ridge or Ice Lake server uncore document.
> https://cdrdv2.intel.com/v1/dl/getContent/611319
>
> Kan Liang (5):
> perf/x86/intel/uncore: Parse uncore discovery tables
> perf/x86/intel/uncore: Generic support for the MSR type of uncore
> blocks
> perf/x86/intel/uncore: Rename uncore_notifier to
> uncore_pci_sub_notifier
> perf/x86/intel/uncore: Generic support for the PCI type of uncore
> blocks
> perf/x86/intel/uncore: Generic support for the MMIO type of uncore
> blocks
>
> arch/x86/events/intel/Makefile | 2 +-
> arch/x86/events/intel/uncore.c | 188 ++++++++--
> arch/x86/events/intel/uncore.h | 10 +-
> arch/x86/events/intel/uncore_discovery.c | 622 +++++++++++++++++++++++++++++++
> arch/x86/events/intel/uncore_discovery.h | 131 +++++++
> 5 files changed, 922 insertions(+), 31 deletions(-)
> create mode 100644 arch/x86/events/intel/uncore_discovery.c
> create mode 100644 arch/x86/events/intel/uncore_discovery.h
>