[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <79e86754-710a-4335-8a09-a756201f7ae4@intel.com>
Date: Tue, 17 Jun 2025 09:08:27 -0700
From: Dave Jiang <dave.jiang@...el.com>
To: Lukas Wunner <lukas@...ner.de>, "Bowman, Terry" <terry.bowman@....com>
Cc: dave@...olabs.net, jonathan.cameron@...wei.com,
alison.schofield@...el.com, vishal.l.verma@...el.com, ira.weiny@...el.com,
dan.j.williams@...el.com, bhelgaas@...gle.com, bp@...en8.de,
ming.li@...omail.com, shiju.jose@...wei.com, dan.carpenter@...aro.org,
Smita.KoralahalliChannabasappa@....com, kobayashi.da-06@...itsu.com,
rrichter@....com, peterz@...radead.org,
fabio.m.de.francesco@...ux.intel.com, ilpo.jarvinen@...ux.intel.com,
yazen.ghannam@....com, linux-cxl@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org
Subject: Re: [PATCH v9 04/16] PCI/AER: Dequeue forwarded CXL error
On 6/10/25 9:38 PM, Lukas Wunner wrote:
> On Tue, Jun 10, 2025 at 04:20:53PM -0500, Bowman, Terry wrote:
>> On 6/10/2025 1:07 PM, Bowman, Terry wrote:
>>> On 6/9/2025 11:15 PM, Lukas Wunner wrote:
>>>> On Tue, Jun 03, 2025 at 12:22:27PM -0500, Terry Bowman wrote:
>>>>> --- a/drivers/cxl/core/ras.c
>>>>> +++ b/drivers/cxl/core/ras.c
>>>>> +static int cxl_rch_handle_error_iter(struct pci_dev *pdev, void *data)
>>>>> +{
>>>>> + struct cxl_prot_error_info *err_info = data;
>>>>> + struct pci_dev *pdev_ref __free(pci_dev_put) = pci_dev_get(pdev);
>>>>> + struct cxl_dev_state *cxlds;
>>>>> +
>>>>> + /*
>>>>> + * The capability, status, and control fields in Device 0,
>>>>> + * Function 0 DVSEC control the CXL functionality of the
>>>>> + * entire device (CXL 3.0, 8.1.3).
>>>>> + */
>>>>> + if (pdev->devfn != PCI_DEVFN(0, 0))
>>>>> + return 0;
>>>>> +
>>>>> + /*
>>>>> + * CXL Memory Devices must have the 502h class code set (CXL
>>>>> + * 3.0, 8.1.12.1).
>>>>> + */
>>>>> + if ((pdev->class >> 8) != PCI_CLASS_MEMORY_CXL)
>>>>> + return 0;
>>>>> +
>>>>> + if (!is_cxl_memdev(&pdev->dev) || !pdev->dev.driver)
>>>>> + return 0;
>>>>
>>>> Is the point of the "!pdev->dev.driver" check to ascertain that
>>>> pdev is bound to cxl_pci_driver?
>>>>
>>>> If so, you need to check "if (pdev->driver != &cxl_pci_driver)"
>>>> directly (like cxl_handle_cper_event() does).
>>>>
>>>> That's because there are drivers which may bind to *any* PCI device,
>>>> e.g. vfio_pci_driver.
>>
>> Looking closer to implement this change I find the cxl_pci_driver is
>> defined static in cxl/pci.c and is unavailable to reference in
>> cxl/core/ras.c as-is. Would you like me to export cxl_pci_driver to
>> make available for this check?
>
> I'm not sure you need an export. The consumer you're introducing
> is located in core/ras.c, which is always built-in, never modular,
> hence just making it non-static and adding a declaration to cxlpci.h
> may be sufficient.
>
> An alternative would be to keep it static, but add a non-static helper
> cxl_pci_drv_bound() or something like that.
>
> I'm passing the buck to CXL maintainers for this. :)
I don't have a good solution to this. Moving the declaration of cxl_pci driver to core would be pretty messy. Perhaps doing the dance of calling try_module_get() is less messy? Or maybe Dan has a better idea....
DJ
>
>> The existing class code check guarantees it is a CXL EP. Is it not
>> safe to expect it is bound to a the CXL driver?
>
> Just checking for the pci_dev being bound seems insufficient to me
> because of the vfio_pci_driver case and potentially others.
>
> HTH,
>
> Lukas
>
Powered by blists - more mailing lists