lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <d86a7533-6f13-0c73-6c5d-168b54a0e85a@huawei.com>
Date:   Wed, 23 Jun 2021 17:59:21 +0800
From:   "liuqi (BA)" <liuqi115@...wei.com>
To:     Will Deacon <will@...nel.org>, Linuxarm <linuxarm@...wei.com>
CC:     <mark.rutland@....com>, <bhelgaas@...gle.com>,
        <linux-pci@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        <linux-kernel@...r.kernel.org>, <zhangshaokun@...ilicon.com>
Subject: Re: [PATCH v6 2/2] drivers/perf: hisi: Add driver for HiSilicon PCIe
 PMU



On 2021/6/22 1:59, Will Deacon wrote:
> On Fri, Jun 18, 2021 at 05:32:47PM +0800, liuqi (BA) wrote:
>>
>>
>> On 2021/6/18 1:57, Will Deacon wrote:
>>> On Thu, Jun 17, 2021 at 07:00:26PM +0800, liuqi (BA) wrote:
>>>>
>>>>
>>>> On 2021/6/16 21:42, Will Deacon wrote:
>>>>> Hi,
>>>>>
>>>>> On Wed, Jun 16, 2021 at 09:54:23AM +0800, liuqi (BA) wrote:
>>>>>> On 2021/6/15 17:35, Will Deacon wrote:
>>>>>>> On Tue, Jun 15, 2021 at 04:57:09PM +0800, liuqi (BA) wrote:
>>>>>>>> On 2021/6/12 0:23, Will Deacon wrote:
>>>>>>>>> On Mon, May 31, 2021 at 09:32:31PM +0800, Qi Liu wrote:
>>>>>>>>>> +	/* Process data to set unit of latency as "us". */
>>>>>>>>>> +	if (is_latency_event(idx))
>>>>>>>>>> +		return div64_u64(data * us_per_cycle, data_ext);
>>>>>>>>>> +
>>>>>>>>>> +	if (is_bus_util_event(idx))
>>>>>>>>>> +		return div64_u64(data * us_per_cycle, data_ext);
>>>>>>>>>> +
>>>>>>>>>> +	if (is_buf_util_event(idx))
>>>>>>>>>> +		return div64_u64(data, data_ext * us_per_cycle);
>>>>>>>>>
>>>>>>>>> Why do we need to do all this division in the kernel? Can't we just expose
>>>>>>>>> the underlying values and let userspace figure out what it wants to do with
>>>>>>>>> the numbers?
>>>>>>>>>
>>>>>>>> Our PMU hardware support 8 sets of counters to count bandwidth, latency and
>>>>>>>> utilization events.
>>>>>>>>
>>>>>>>> For example, when users set latency event, common counter will count delay
>>>>>>>> cycles, and extern counter count number of PCIe packets automaticly. And we
>>>>>>>> do not have a event number for counting number of PCIe packets.
>>>>>>>>
>>>>>>>> So this division cannot move to userspace tool.
>>>>>>>
>>>>>>> Why can't you expose the packet counter as an extra event to userspace?
>>>>>>>
>>>>>> Maybe I didn’t express it clearly.
>>>>>>
>>>>>> As there is no hardware event number for PCIe packets counting, extern
>>>>>> counter count packets *automaticly* when latency events is selected by
>>>>>> users.
>>>>>>
>>>>>> This means users cannot set "config=0xXX" to start packets counting event.
>>>>>> So we can only get the value of counter and extern counter in driver and do
>>>>>> the division, then pass the result to userspace.
>>>>>
>>>>> I still think it would be ideal if we could expose both values to userspace
>>>>> rather than combine them somehow. Hmm. Anyway...
>>>>>
>>>>> I struggled to figure out exactly what's being counted from the
>>>>> documentation patch (please update that). Please can you explain exactly
>>>>> what appears in the HISI_PCIE_CNT and HISI_PCIE_EXT_CNT registers for the
>>>>> different modes of operation? Without that, the ratios you've chosen to
>>>>> report seem rather arbitrary.
>>>>>
>>>>
>>>> PCIe PMU events can be devided into 2 types: one type is counted by
>>>> HISI_PCIE_CNT, the other type is counted by HISI_PCIE_EXT_CNT and
>>>> HISI_PCIE_CNT, including bandwidth events, latency events, buffer
>>>> utilization and bus utilization.
>>>>
>>>> if user sets "event=0x10, subevent=0x02", this means "latency of RX memory
>>>> read" is selected. HISI_PCIE_CNT counts total delay cycles and
>>>> HISI_PCIE_EXT_CNT counts PCIe packets number at the same time. So PMU driver
>>>> could obtain average latency by caculating: HISI_PCIE_CNT /
>>>> HISI_PCIE_EXT_CNT.
>>>>
>>>> if users sets "event=0x04, subevent=0x01", this means bandwidth of RX memory
>>>> read is selected. HISI_PCIE_CNT counts total packet data volume and
>>>> HISI_PCIE_EXT_CNT counts cycles, so PMU driver could obtain average
>>>> bandwidth by caculating: HISI_PCIE_CNT / HISI_PCIE_EXT_CNT.
>>>>
>>>> The same logic is used when calculating bus utilization and buffer
>>>> utilization. Seems I should add this part in Document patch,I 'll do this in
>>>> next version, thanks.
>>>>
>>>>> I also couldn't figure out how the latency event works. For example, I was
>>>>> assuming it would be a filter (a bit like the length), so you could say
>>>>> things like "I'm only interested in packets with a latency higher than x"
>>>>> but it doesn't look like it works that way.
>>>>>
>>>>> Thanks,
>>>>>
>>>> latency is not a filter, PCIe PMU has a group of lactency events, their
>>>> event number are within the latency_events_list, and the above explains how
>>>> latency events work.
>>>>
>>>> PMU drivers have TLP length filter for bandwidth events, users could set
>>>> like "I only interested in bandwidth of packets with TLP length bigger than
>>>> x".
>>>
>>> Thanks for the explanations, I think I get it a bit better now. But I still
>>> think we should be exposing both of the values to userspace instead of
>>> reporting the ratio from which the individual counters are then
>>> unrecoverable.
>>>
>>> It will complicate the driver slightly, but can we instead expose the
>>> events independently and then allowing scheduling some of them in groups?
>>>
>>> That way we just treat HISI_PCIE_CNT and HISI_PCIE_EXT_CNT as separate
>>> counters, but with a scheduling constraint that events in a register pair
>>> must be in the same group.
>>>
>>> Will
>>
>> Hi Will,
>>
>> I got what you mean, treating HISI_PCIE_CNT and HISI_PCIE_EXT_CNT as
>> separate counters is a great idea, but here is a hardware limitation.
>>
>> The behavior of HISI_PCIE_EXT_CNT is controlled by hardware logical, so
>> HISI_PCIE_EXT_CNT only works when latency/bandwidth/... events number are
>> set in HISI_PCIE_EVENT_CTRL. So driver cannot separate this two counters,
>> they must work together because of hardware limitation.
> 
> Why can't you express this as a scheduling constraint when adding a new
> event though? In other words:
> 
>    * If you only have a latency event, then program the latency counter and
>      "reserve" HISI_PCIE_EXT_CNT
>    * If you only have a packets event, then program HISI_PCIE_EXT_CNT and
>      "reserve" HISI_PCIE_CNT with a dummy latency event so that HISI_PCIE_EXT_CNT
>      actually counts something
>    * If the user wants to count both latency and packets, then they need to
>      create two events in the same group, at which point you can program
>      both HISI_PCIE_CNT and HISI_PCIE_EXT_CNT and return the results in the
>      two events
> 
> Will
> .
Hi Will,

Thanks for your opinion, I'll add a group of events for 
HISI_PCIE_EXT_CNT, and export ext_data to userspase.

I'll send a new version latter.

Qi
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ