Message-ID: <79a042c8-b508-a33b-fe69-1c19fc579161@linux.intel.com>
Date: Thu, 13 Feb 2020 15:36:44 +0300
From: "Sudarikov, Roman" <roman.sudarikov@...ux.intel.com>
To: Greg KH <gregkh@...uxfoundation.org>,
"Liang, Kan" <kan.liang@...ux.intel.com>
Cc: Andi Kleen <ak@...ux.intel.com>, peterz@...radead.org,
mingo@...hat.com, acme@...nel.org, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...hat.com,
namhyung@...nel.org, linux-kernel@...r.kernel.org,
eranian@...gle.com, bgregg@...flix.com, alexander.antonov@...el.com
Subject: Re: [PATCH v5 3/3] perf x86: Exposing an Uncore unit to PMON for Intel Xeon® server platform
On 13.02.2020 1:56, Greg KH wrote:
> On Wed, Feb 12, 2020 at 03:58:50PM -0500, Liang, Kan wrote:
>>
>> On 2/12/2020 12:31 PM, Sudarikov, Roman wrote:
>>> On 11.02.2020 23:14, Greg KH wrote:
>>>> On Tue, Feb 11, 2020 at 02:59:21PM -0500, Liang, Kan wrote:
>>>>> On 2/11/2020 1:57 PM, Greg KH wrote:
>>>>>> On Tue, Feb 11, 2020 at 10:42:00AM -0800, Andi Kleen wrote:
>>>>>>> On Tue, Feb 11, 2020 at 09:15:44AM -0800, Greg KH wrote:
>>>>>>>> On Tue, Feb 11, 2020 at 07:15:49PM +0300,
>>>>>>>> roman.sudarikov@...ux.intel.com wrote:
>>>>>>>>> +static ssize_t skx_iio_mapping_show(struct device *dev,
>>>>>>>>> + struct device_attribute *attr, char *buf)
>>>>>>>>> +{
>>>>>>>>> + struct pmu *pmu = dev_get_drvdata(dev);
>>>>>>>>> + struct intel_uncore_pmu *uncore_pmu =
>>>>>>>>> + container_of(pmu, struct intel_uncore_pmu, pmu);
>>>>>>>>> +
>>>>>>>>> + struct dev_ext_attribute *ea =
>>>>>>>>> + container_of(attr, struct dev_ext_attribute, attr);
>>>>>>>>> + long die = (long)ea->var;
>>>>>>>>> +
>>>>>>>>> + return sprintf(buf, "0000:%02x\n",
>>>>>>>>> skx_iio_stack(uncore_pmu, die));
>>>>>>>> If "0000:" is always the "prefix" of the output of
>>>>>>>> this file, why have
>>>>>>>> it at all as you always know it is there?
>>>>> I think Roman only tested with the BIOS configured as single-segment, so he
>>>>> hard-coded the segment# here.
>>>>>
>>>>> I'm not sure if Roman can do some test with multiple-segment
>>>>> BIOS. If not, I
>>>>> think we should at least print a warning here.
>>>>>
>>>>>>>> What is ever going to cause that to change?
>>>>>>> I think it's just to make it a complete PCI address.
>>>>>> Is that what this really is? If so, it's not a "complete" pci address,
>>>>>> is it? If it is, use the real pci address please.
>>>>> I think we don't need a complete PCI address here. The attr is
>>>>> to disclose
>>>>> the mapping information between die and PCI BUS. Segment:BUS
>>>>> should be good
>>>>> enough.
>>>> "good enough" for today, but note that you can not change the format of
>>>> the data in the file in the future, you would have to create a new file.
>>>> So I suggest at least try to future-proof it as much as possible if you
>>>> _know_ this could change.
>>>>
>>>> Just use the full pci address, there's no reason not to, otherwise it's
>>>> just confusing.
>>>>
>>>> thanks,
>>>>
>>>> greg k-h
>>> Hi Greg,
>>>
>>> Yes, the "Segment:Bus" pair is enough to distinguish between different
>>> Root ports.
>> I think Greg suggests us to use full PCI address here.
>>
>> Hi Greg,
>>
>> There may be several devices connected to an IIO stack. There is no full
>> PCI address for the IIO stack itself.
> Please define "full" for me. Please please don't tell me you are just
> using a truncated version of the PCI address. I thought we got rid of
> all of that nonsense 10 years ago...
>
>> I don't think we can list all of devices in the same IIO stack with full PCI
>> address here either. It's not necessary, and only increase maintenance
>> overhead.
> Then what exactly _IS_ this number, if not the PCI address?
>
> Something made up to look almost like a PCI address, but not quite?
> Something else?
>
>> I think we may have two options here.
>>
>> Option 1: Roman's proposal. The format of the file is "Segment:Bus". As far
>> as I can foresee, the format won't need to change.
>> E.g. $ls /sys/devices/uncore_<type>_<pmu_idx>/die0
>> $0000:7f
> Again, fake PCI address?
Hi Greg,
Actually, there are two reasons why we chose the "Segment:Root Bus"
notation to represent the Root port to IO PMU mapping:
1. it meets the feature requirement to uniquely identify each Root Port on
the system
2. that notation, "Segment:Root Bus", is already used by the kernel to
represent Root ports in sysfs; see commit 37d6a0a6f4700 and the example
below, taken from Intel Xeon V5 (Skylake Server):
# ls /sys/devices/ | grep pci
pci0000:00
pci0000:17
pci0000:3a
pci0000:5d
pci0000:80
pci0000:85
pci0000:ae
pci0000:d7
Having a full conventional PCI address in the form of
"Segment:Bus:Device.Function" is just not required to distinguish one
Root Bus from another.
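For illustration only, here is a minimal userspace sketch of how a tool
could go from the proposed "Segment:Bus" string to the matching root bus
directory shown in the listing above. It assumes the attribute keeps the
"SSSS:BB" format from this patch; the helper names parse_segment_bus() and
root_bus_sysfs_path() are made up for this example and are not part of the
patch or of any existing library.

```c
#include <stdio.h>

/*
 * Parse a "SSSS:BB" string (hex segment, hex bus), optionally followed by
 * a single trailing newline as a sysfs show() callback would emit.
 * Returns 0 on success, -1 on malformed input.
 * Hypothetical helper, for illustration only.
 */
static int parse_segment_bus(const char *s, unsigned int *segment,
			     unsigned int *bus)
{
	int consumed = 0;

	/* %n records how many characters were consumed; it is not counted
	 * in the return value of sscanf(). */
	if (sscanf(s, "%4x:%2x%n", segment, bus, &consumed) != 2)
		return -1;

	if (s[consumed] == '\n')
		consumed++;

	/* Anything left over (e.g. ":-1:-1") is treated as an error. */
	return s[consumed] == '\0' ? 0 : -1;
}

/*
 * Build the sysfs directory name of the PCI root bus for a mapping string,
 * using the "pciSSSS:BB" naming seen under /sys/devices/.
 * Returns 0 on success, -1 on parse error or if the buffer is too small.
 * Hypothetical helper, for illustration only.
 */
static int root_bus_sysfs_path(const char *mapping, char *out, size_t len)
{
	unsigned int segment, bus;
	int n;

	if (parse_segment_bus(mapping, &segment, &bus) < 0)
		return -1;

	n = snprintf(out, len, "/sys/devices/pci%04x:%02x", segment, bus);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With the contents of the die0 attribute passed in as the mapping string,
the resulting path is the root bus directory under which the devices of
that IIO stack can be enumerated, with no Device.Function part needed.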
But if there is any other agreement on how PCI Root ports are supposed to
show up in sysfs, then please let us know.
Thanks,
Roman
>> Option 2: Use full PCI address, but use -1 to indicate invalid address.
>> E.g. $ls /sys/devices/uncore_<type>_<pmu_idx>/die0
>> $0000:7f:-1:-1
> "Invalid"? Why? Why not just refer to the 0:0 device, as that's the
> bus "root" address (or whatever it's called, I can't remember PCI stuff
> all that well...)
>
>> Should we use the format in option 2?
> What could userspace do with a -1 -1 address?
>
> thanks,
>
> greg k-h