[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f0241dfe-aa81-4e12-8465-683eaf8937ea@linux.alibaba.com>
Date: Mon, 23 Oct 2023 17:39:30 +0800
From: Shuai Xue <xueshuai@...ux.alibaba.com>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc: linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-pci@...r.kernel.org, rdunlap@...radead.org,
mark.rutland@....com, zhuo.song@...ux.alibaba.com,
renyu.zj@...ux.alibaba.com, chengyou@...ux.alibaba.com,
kaishen@...ux.alibaba.com, helgaas@...nel.org,
yangyicong@...wei.com, will@...nel.org,
Jonathan.Cameron@...wei.com, robin.murphy@....com
Subject: Re: [PATCH v9 3/4] drivers/perf: add DesignWare PCIe PMU driver
On 2023/10/23 10:05, Baolin Wang wrote:
>
>
> On 10/22/2023 3:47 PM, Shuai Xue wrote:
>> Hi, Baolin,
>>
>> I droped your Revivewed-by tag due to that I made significant changes to this
>> patch previously, please explicty give me Revivewed-by tag again if you are
>> happy with the changes.
>
> Yes, I am happy with this version (just some nits as below), and thanks for the review from other guys. Please feel free to add:
>
> Reviewed-by: Baolin Wang <baolin.wang@...ux.alibaba.com>
Thank you.
Best Regards,
Shuai
>
>> On 2023/10/20 21:42, Shuai Xue wrote:
>>> This commit adds the PCIe Performance Monitoring Unit (PMU) driver support
>>> for T-Head Yitian SoC chip. Yitian is based on the Synopsys PCI Express
>>> Core controller IP which provides statistics feature. The PMU is a PCIe
>>> configuration space register block provided by each PCIe Root Port in a
>>> Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error
>>> injection, and Statistics).
>>>
>>> To facilitate collection of statistics the controller provides the
>>> following two features for each Root Port:
>>>
>>> - one 64-bit counter for Time Based Analysis (RX/TX data throughput and
>>> time spent in each low-power LTSSM state) and
>>> - one 32-bit counter for Event Counting (error and non-error events for
>>> a specified lane)
>>>
>>> Note: There is no interrupt for counter overflow.
>>>
>>> This driver adds PMU devices for each PCIe Root Port. And the PMU device is
>>> named based the BDF of Root Port. For example,
>>>
>>> 30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
>>>
>>> the PMU device name for this Root Port is dwc_rootport_3018.
>>>
>>> Example usage of counting PCIe RX TLP data payload (Units of bytes)::
>>>
>>> $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
>>>
>>> average RX bandwidth can be calculated like this:
>>>
>>> PCIe TX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window
>>>
>>> Signed-off-by: Shuai Xue <xueshuai@...ux.alibaba.com>
>>> ---
>
> [snip]
>
>>> +static u64 dwc_pcie_pmu_read_time_based_counter(struct perf_event *event)
>>> +{
>>> + struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu);
>>> + struct pci_dev *pdev = pcie_pmu->pdev;
>>> + int event_id = DWC_PCIE_EVENT_ID(event);
>>> + u16 ras_des_offset = pcie_pmu->ras_des_offset;
>>> + u32 lo, hi, ss;
>>> +
>>> + /*
>>> + * The 64-bit value of the data counter is spread across two
>>> + * registers that are not synchronized. In order to read them
>>> + * atomically, ensure that the high 32 bits match before and after
>>> + * reading the low 32 bits.
>>> + */
>>> + pci_read_config_dword(pdev, ras_des_offset +
>>> + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_HIGH, &hi);
>>> + do {
>>> + /* snapshot the high 32 bits */
>>> + ss = hi;
>>> +
>>> + pci_read_config_dword(
>>> + pdev, ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_LOW,
>>> + &lo);
>>> + pci_read_config_dword(
>>> + pdev, ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_HIGH,
>>> + &hi);
>>> + } while (hi != ss);
>>> +
>>> + /*
>>> + * The Group#1 event measures the amount of data processed in 16-byte
>>> + * units. Simplify the end-user interface by multiplying the counter
>>> + * at the point of read.
>>> + */
>>> + if (event_id >= 0x20 && event_id <= 0x23)
>>> + return (((u64)hi << 32) | lo) << 4;
>>> + else
>
> You can drop the 'else'.
Agreed, will fix it in next version.
>>> +
>>> + event->cpu = pcie_pmu->on_cpu;
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static void dwc_pcie_pmu_set_period(struct hw_perf_event *hwc)
>>> +{
>>> + local64_set(&hwc->prev_count, 0);
>>> +}
>
> Only dwc_pcie_pmu_event_start() will call this small function, why just remove this function and move local64_set() into dwc_pcie_pmu_event_start()?
Good suggestion, will fix it.
Powered by blists - more mailing lists