Date:   Fri, 23 Sep 2022 11:56:44 +0100
From:   Jonathan Cameron <Jonathan.Cameron@...wei.com>
To:     Yicong Yang <yangyicong@...wei.com>
CC:     Bjorn Helgaas <helgaas@...nel.org>, <yangyicong@...ilicon.com>,
        "Shuai Xue" <xueshuai@...ux.alibaba.com>, <will@...nel.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        <linux-kernel@...r.kernel.org>, <rdunlap@...radead.org>,
        <robin.murphy@....com>, <mark.rutland@....com>,
        <baolin.wang@...ux.alibaba.com>, <zhuo.song@...ux.alibaba.com>,
        <linux-pci@...r.kernel.org>,
        Dan Williams <dan.j.williams@...el.com>,
        <linux-cxl@...r.kernel.org>
Subject: Re: [PATCH v1 2/3] drivers/perf: add DesignWare PCIe PMU driver

On Fri, 23 Sep 2022 11:35:45 +0800
Yicong Yang <yangyicong@...wei.com> wrote:

> On 2022/9/23 1:32, Bjorn Helgaas wrote:
> > On Thu, Sep 22, 2022 at 04:58:20PM +0100, Jonathan Cameron wrote:  
> >> On Sat, 17 Sep 2022 20:10:35 +0800
> >> Shuai Xue <xueshuai@...ux.alibaba.com> wrote:
> >>  
> >>> This commit adds the PCIe Performance Monitoring Unit (PMU) driver support
> >>> for the T-Head Yitian SoC. Yitian is based on the Synopsys PCI Express
> >>> Core controller IP, which provides a statistics feature. The PMU is not a
> >>> PCIe Root Complex Integrated Endpoint (RCiEP) device; it is only a set of
> >>> register counters provided by each PCIe Root Port.
> >>>
> >>> To facilitate collection of statistics the controller provides the
> >>> following two features for each Root Port:
> >>>
> >>> - Time Based Analysis (RX/TX data throughput and time spent in each
> >>>   low-power LTSSM state)
> >>> - Event counters (Error and Non-Error for lanes)
> >>>
> >>> Note, only one counter for each type.
> >>>
> >>> This driver adds a PMU device for each PCIe Root Port. The PMU device is
> >>> named based on the BDF of the Root Port. For example,
> >>>
> >>>     10:00.0 PCI bridge: Device 1ded:8000 (rev 01)
> >>>
> >>> the PMU device name for this Root Port is pcie_bdf_100000.
> >>>
> >>> Example usage of counting PCIe RX TLP data payload (Units of 16 bytes)::
> >>>
> >>>     $# perf stat -a -e pcie_bdf_200/Rx_PCIe_TLP_Data_Payload/
> >>>
> >>> Average RX bandwidth can be calculated like this:
> >>>
> >>>     PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window
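As a worked example of the calculation quoted above, here is a minimal
user-space sketch. It assumes the counter value has already been read from
the perf stat output and that each count represents 16 bytes, as the commit
message states. The helper name, the macro name and the sample numbers are
hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Each count of the TLP data payload event is 16 bytes (per the commit message). */
#define PAYLOAD_UNIT_BYTES 16.0

/*
 * Hypothetical helper: convert a raw payload counter value and a
 * measurement window (in seconds) into an average bandwidth in bytes/second.
 * Returns 0.0 for a non-positive window rather than dividing by zero.
 */
static double pcie_avg_bandwidth(uint64_t payload_count, double window_seconds)
{
	if (window_seconds <= 0.0)
		return 0.0;

	/* Convert to double before scaling so a large count cannot overflow. */
	return ((double)payload_count * PAYLOAD_UNIT_BYTES) / window_seconds;
}
```

For instance, a count of 1,000,000 over a 2 second window works out to
8,000,000 bytes per second.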
> >>>
> >>> Signed-off-by: Shuai Xue <xueshuai@...ux.alibaba.com>  
> >>
> >> +CC linux-pci list and Bjorn.  
> > 
> > Thanks, this is definitely of interest to linux-pci.
> >   
> >> Question in here which I've been meaning to address for other reasons
> >> around how to register 'extra features' on pci ports.
> >>
> >> This particular PMU is in config space in a Vendor Specific Extended
> >> Capability.
> >>
> >> I've focused on that aspect for this review rather than the perf parts.
> >> We'll need to figure that story out first, as doing this from a bus walk
> >> triggered from a platform driver is not the way I'd expect to see
> >> this work.  
> >   
> >>> +static int dwc_pcie_pmu_discover(struct dwc_pcie_pmu_priv *priv)
> >>> +{
> >>> +	int val, where, index = 0;
> >>> +	struct pci_dev *pdev = NULL;
> >>> +	struct dwc_pcie_info_table *pcie_info;
> >>> +
> >>> +	priv->pcie_table =
> >>> +	    devm_kcalloc(priv->dev, RP_NUM_MAX, sizeof(*pcie_info), GFP_KERNEL);
> >>> +	if (!priv->pcie_table)
> >>> +		return -ENOMEM;
> >>> +
> >>> +	pcie_info = priv->pcie_table;
> >>> +	while ((pdev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pdev)) != NULL &&
> >>> +	       index < RP_NUM_MAX) {  
> >>
> >> Having a driver that then walks the PCI topology to find root ports and add
> >> extra stuff to them is not a clean solution.
> >>
> >> The probing should be driven from the existing PCI driver topology.
> >> There are a bunch of new features we need to add to ports in the near future
> >> anyway - this would just be another one.
> >> Same problem exists for CXL CPMU perf devices - so far we only support those
> >> on end points, partly because we need a clean way to probe them on pci ports.
> >>
> >> Whatever we come up with there will apply here as well.  
> > 
> > I agree, I don't like to see more uses of pci_get_device() because it
> > doesn't fit the driver model at all.  For one thing, it really screws
> > up the hotplug model because this doesn't account for hot-added
> > devices and there's no clear cleanup path for removal.
> > 
> > Hotplug is likely not an issue in this particular case, but it gets
> > copied to places where it is an issue.
> > 
> > Maybe we need some kind of PCI core interface whereby drivers can
> > register their interest in VSEC and/or DVSEC capabilities.


Something along those lines works if the facility is constrained to just
VSEC / DVSEC.
 * This one is.
 * CMA / SPDM / IDE all are - but with the complexity of interrupts.
   After the Plumbers SPDM BoF, the resulting plan would not fit in the
   same model as this driver (it needs to be done earlier in the PCI
   registration flow, I think).  I need to write up and share some notes on
   what we are planning around that to get wider feedback - but it might be
   a few weeks!

Others are less well confined.
 * CXL PMU uses registers in bar space - but is hanging off a DVSEC
   description to tell you where to find it.
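
To make the VSEC case concrete, below is a rough user-space sketch of the
lookup that any such registration interface would have to perform per
device. It is not kernel code: config space is modelled as a plain byte
array, and find_vsec(), cfg_read32() and cfg_write32() are hypothetical
helpers for illustration only. The layout assumed is the standard one
(extended capabilities start at 0x100, capability ID in bits 15:0 and next
pointer in bits 31:20 of the header, VSEC ID in bits 15:0 of the dword at
offset 4). A real implementation would also check the device's vendor ID,
since VSEC IDs are only meaningful per vendor.

```c
#include <stdint.h>

#define CFG_SPACE_SIZE   4096
#define EXT_CAP_START    0x100
#define EXT_CAP_ID_VSEC  0x000b	/* Vendor-Specific Extended Capability */

/* Little-endian 32-bit read from a config space image. */
static uint32_t cfg_read32(const uint8_t *cfg, unsigned int off)
{
	return (uint32_t)cfg[off] |
	       ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) |
	       ((uint32_t)cfg[off + 3] << 24);
}

/* Little-endian 32-bit write, only used to build a fake image for testing. */
static void cfg_write32(uint8_t *cfg, unsigned int off, uint32_t val)
{
	cfg[off] = val & 0xff;
	cfg[off + 1] = (val >> 8) & 0xff;
	cfg[off + 2] = (val >> 16) & 0xff;
	cfg[off + 3] = (val >> 24) & 0xff;
}

/*
 * Walk the extended capability list and return the offset of the first
 * VSEC whose VSEC ID matches, or 0 if there is none.
 */
static unsigned int find_vsec(const uint8_t *cfg, uint16_t vsec_id)
{
	unsigned int pos = EXT_CAP_START;
	/* Bound the walk so a malformed, looping list cannot hang us. */
	int ttl = (CFG_SPACE_SIZE - EXT_CAP_START) / 8;

	while (pos >= EXT_CAP_START && pos <= CFG_SPACE_SIZE - 8 && ttl-- > 0) {
		uint32_t hdr = cfg_read32(cfg, pos);

		/* All zeros or all ones means no (further) capabilities. */
		if (hdr == 0 || hdr == 0xffffffff)
			break;

		if ((hdr & 0xffff) == EXT_CAP_ID_VSEC) {
			uint32_t vsec_hdr = cfg_read32(cfg, pos + 4);

			if ((vsec_hdr & 0xffff) == vsec_id)
				return pos;
		}

		/* Next pointer is in bits 31:20 and is dword aligned. */
		pos = (hdr >> 20) & 0xffc;
	}

	return 0;
}
```

The same walk would not be enough for the CXL PMU case, because there the
DVSEC only describes where to find the registers in BAR space.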

> >   
> 
> Considering this PMU is tied to each Root Port without a real backing device, I'm
> wondering whether we can extend the PCIe port bus driver and make use of it (though
> it's currently only used by the standard services).

I did that a few years back for our older PCI PMUs.  It wasn't pretty.
https://lore.kernel.org/all/20181214131055.52253-2-Jonathan.Cameron@huawei.com/

We never took that driver forward - it was mostly useful for understanding what
might work for newer hardware. We went the RCiEP route at least partly to avoid
software complexity (and because of hardware topology: counters are shared by
multiple RPs).

We could do something more generic along the same lines as the portdrv framework
 - that highlights some of the complexities however.
There are some nasty potential ordering issues in registering interest caused
by any attempt to make this work with modules.
I'd want to see a solution that works just as well for all the components that
might have DVSEC / VSEC entries - not just those covered by portdrv.
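
To illustrate the ordering point, here is a toy user-space model, not a
proposal for the actual interface. Every name in it (vsec_register_handler(),
vsec_add_port(), struct vsec_handler, struct vsec_port, example_probe()) is
hypothetical, and the VSEC ID used in any example is made up. The only thing
it shows is that matching has to be attempted both when a handler registers
(a module loaded after enumeration) and when a port appears (a module loaded
first, or a hot-added port). It ignores locking, removal and module unload,
which is where most of the real complexity lives.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_HANDLERS 8
#define MAX_PORTS    8

/* A port exposing one VSEC; 'bound' is set once a handler has claimed it. */
struct vsec_port {
	uint16_t vendor;
	uint16_t vsec_id;
	int bound;
};

/* A driver's statement of interest in a (vendor, VSEC ID) pair. */
struct vsec_handler {
	uint16_t vendor;
	uint16_t vsec_id;
	int (*probe)(struct vsec_port *port);
};

static struct vsec_handler *handlers[MAX_HANDLERS];
static struct vsec_port *ports[MAX_PORTS];
static size_t nr_handlers, nr_ports;

/* Bind the handler to the port if the IDs match and the port is unclaimed. */
static void try_bind(struct vsec_handler *h, struct vsec_port *p)
{
	if (p->bound || h->vendor != p->vendor || h->vsec_id != p->vsec_id)
		return;
	if (h->probe(p) == 0)
		p->bound = 1;
}

/* Handler arrives: record it, then match against ports already present. */
static int vsec_register_handler(struct vsec_handler *h)
{
	size_t i;

	if (nr_handlers == MAX_HANDLERS)
		return -1;
	handlers[nr_handlers++] = h;
	for (i = 0; i < nr_ports; i++)
		try_bind(h, ports[i]);
	return 0;
}

/* Port arrives: record it, then match against handlers already registered. */
static int vsec_add_port(struct vsec_port *p)
{
	size_t i;

	if (nr_ports == MAX_PORTS)
		return -1;
	ports[nr_ports++] = p;
	for (i = 0; i < nr_handlers; i++)
		try_bind(handlers[i], p);
	return 0;
}

/* Example probe callback that just counts how often it was invoked. */
static int example_probe_calls;

static int example_probe(struct vsec_port *port)
{
	(void)port;
	example_probe_calls++;
	return 0;
}
```

With this model a port added before the handler registers and a port added
afterwards both end up bound, each with exactly one probe call.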

+CC Dan Williams and linux-cxl as they may also be interested in this discussion.

> 
> Thanks.
> 
