Message-ID: <9bfcc058-8fde-9b24-3d82-255004e7f057@linux.intel.com>
Date: Tue, 14 Jan 2020 16:24:34 +0300
From: "Sudarikov, Roman" <roman.sudarikov@...ux.intel.com>
To: Greg KH <gregkh@...uxfoundation.org>
Cc: peterz@...radead.org, mingo@...hat.com, acme@...nel.org,
mark.rutland@....com, alexander.shishkin@...ux.intel.com,
jolsa@...hat.com, namhyung@...nel.org,
linux-kernel@...r.kernel.org, eranian@...gle.com,
bgregg@...flix.com, ak@...ux.intel.com, kan.liang@...ux.intel.com,
alexander.antonov@...el.com
Subject: Re: [PATCH v3 1/2] perf x86: Infrastructure for exposing an Uncore
unit to PMON mapping
On 13.01.2020 17:34, Greg KH wrote:
> On Mon, Jan 13, 2020 at 04:54:43PM +0300, roman.sudarikov@...ux.intel.com wrote:
>> From: Roman Sudarikov <roman.sudarikov@...ux.intel.com>
>>
>> Intel® Xeon® Scalable processor family (code name Skylake-SP) makes
>> significant changes in the integrated I/O (IIO) architecture. The new
>> solution introduces IIO stacks which are responsible for managing traffic
>> between the PCIe domain and the Mesh domain. Each IIO stack has its own
>> PMON block and can handle either DMI port, x16 PCIe root port, MCP-Link
>> or various built-in accelerators. IIO PMON blocks allow concurrent
>> monitoring of I/O flows up to 4 x4 bifurcation within each IIO stack.
>>
>> Software is supposed to program required perf counters within each IIO
>> stack and gather performance data. The tricky thing here is that IIO PMON
>> reports data per IIO stack but users have no idea what IIO stacks are -
>> they only know devices which are connected to the platform.
>>
>> Understanding IIO stack concept to find which IIO stack that particular
>> IO device is connected to, or to identify an IIO PMON block to program
>> for monitoring specific IIO stack assumes a lot of implicit knowledge
>> about given Intel server platform architecture.
>>
>> Usage example:
>> /sys/devices/uncore_<type>_<pmu_idx>/platform_mapping
>>
>> Each Uncore unit type, by its nature, can be mapped to its own context,
>> for example:
>> 1. CHA - each uncore_cha_<pmu_idx> is assigned to manage a distinct slice
>> of LLC capacity;
>> 2. UPI - each uncore_upi_<pmu_idx> is assigned to manage one link of Intel
>> UPI Subsystem;
>> 3. IIO - each uncore_iio_<pmu_idx> is assigned to manage one stack of the
>> IIO module;
>> 4. IMC - each uncore_imc_<pmu_idx> is assigned to manage one channel of
>> Memory Controller.
>>
>> Implementation details:
>> Two callbacks added to struct intel_uncore_type to discover and map Uncore
>> units to PMONs:
>> int (*get_topology)(void)
>> int (*set_mapping)(struct intel_uncore_pmu *pmu)
>>
>> Details of IIO Uncore unit mapping to IIO PMON:
>> Each IIO stack is either DMI port, x16 PCIe root port, MCP-Link or various
>> built-in accelerators. For Uncore IIO Unit type, the platform_mapping file
>> holds bus numbers of devices, which can be monitored by that IIO PMON block
>> on each die.
>>
>> Co-developed-by: Alexander Antonov <alexander.antonov@...el.com>
>> Reviewed-by: Kan Liang <kan.liang@...ux.intel.com>
>> Signed-off-by: Alexander Antonov <alexander.antonov@...el.com>
>> Signed-off-by: Roman Sudarikov <roman.sudarikov@...ux.intel.com>
>> ---
>> arch/x86/events/intel/uncore.c | 37 +++++++++++++++++++++++++++++++++-
>> arch/x86/events/intel/uncore.h | 9 ++++++++-
>> 2 files changed, 44 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
>> index 86467f85c383..2c53ad44b51f 100644
>> --- a/arch/x86/events/intel/uncore.c
>> +++ b/arch/x86/events/intel/uncore.c
>> @@ -905,6 +905,32 @@ static void uncore_types_exit(struct intel_uncore_type **types)
>> uncore_type_exit(*types);
>> }
>>
>> +static struct attribute *empty_attrs[] = {
>> + NULL,
>> +};
>> +
>> +static const struct attribute_group empty_group = {
>> + .attrs = empty_attrs,
>> +};
> What is this for? Why is it needed? It doesn't do anything?
>
>> +
>> +static ssize_t platform_mapping_show(struct device *dev,
>> + struct device_attribute *attr, char *buf)
>> +{
>> + struct intel_uncore_pmu *pmu = dev_get_drvdata(dev);
>> +
>> + return snprintf(buf, PAGE_SIZE - 1, "%s\n", pmu->mapping);
>> +}
>> +static DEVICE_ATTR_RO(platform_mapping);
> You are creating new sysfs attributes without any Documentation/ABI
> updates, which is not ok. Please fix this up for your next round of
> patches.
>
>> +static struct attribute *mapping_attrs[] = {
>> + &dev_attr_platform_mapping.attr,
>> + NULL,
>> +};
>> +
>> +static const struct attribute_group uncore_mapping_group = {
>> + .attrs = mapping_attrs,
>> +};
> ATTRIBUTE_GROUPS()?
>
> Messing around with single attribute_group lists is usually a sign that
> something is really wrong as the driver core should handle arrays of
> attribute group lists instead.
>
>
>> +
>> static int __init uncore_type_init(struct intel_uncore_type *type, bool setid)
>> {
>> struct intel_uncore_pmu *pmus;
>> @@ -950,10 +976,19 @@ static int __init uncore_type_init(struct intel_uncore_type *type, bool setid)
>> attr_group->attrs[j] = &type->event_descs[j].attr.attr;
>>
>> type->events_group = &attr_group->group;
>> - }
>> + } else
>> + type->events_group = &empty_group;
> Why???
Hi Greg,
Technically, what I'm trying to do is add an attribute which depends on
the uncore PMU type and BIOS support. The new attribute is added to the end
of the attribute groups array. It appears that the events attribute group is
optional for most of the x86/intel uncore PMUs, i.e. events_group = NULL.
A NULL element in the middle of the attribute groups array "hides" all
attribute groups that follow that element.
To work around it, embedded NULL elements should either be removed from
the attribute groups array [1] or replaced with an empty attribute group;
see the implementation above.
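
To show the effect in isolation, here is a minimal user-space sketch. It is
not kernel code: struct group, count_visible() and compact_groups() are
hypothetical names made up for this illustration, and it only assumes that
the consumer walks the array until the first NULL.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct attribute_group. */
struct group {
	const char *name;
};

/*
 * Walk the array the way a NULL-terminated consumer does: stop at the
 * first NULL, so anything stored after an embedded NULL is never seen.
 */
static size_t count_visible(const struct group **groups)
{
	size_t n = 0;

	while (groups[n])
		n++;
	return n;
}

/*
 * Workaround sketch: compact the first 'len' slots in place, dropping
 * NULL holes, then NULL-terminate. The array must have len + 1 slots.
 * Returns the number of groups kept.
 */
static size_t compact_groups(const struct group **groups, size_t len)
{
	size_t i, out = 0;

	for (i = 0; i < len; i++)
		if (groups[i])
			groups[out++] = groups[i];
	groups[out] = NULL;
	return out;
}
```

With an array such as { &format, NULL, &mapping, NULL }, count_visible()
stops at the hole and never reaches the mapping entry; after
compact_groups(groups, 3) both entries are reachable. Replacing the hole
with a pointer to an empty group has the same effect without moving
entries.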
If both approaches are incorrect, then please advise what the correct
solution would be for this case.
[1]
https://lore.kernel.org/lkml/20191210091451.6054-3-roman.sudarikov@linux.intel.com/
Thanks,
Roman
> Didn't we fix up the x86 attributes to work properly and not mess around
> with trying to merge groups and the like? Please don't perpetuate that
> more...
>
> thanks,
>
> greg k-h