lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <44BE3112-64C2-4A64-B9A6-6DD466DC594D@intel.com>
Date:   Fri, 11 Sep 2020 16:16:03 -0700
From:   "Sean V Kelley" <sean.v.kelley@...el.com>
To:     "Bjorn Helgaas" <helgaas@...nel.org>
Cc:     "Zhuo, Qiuxu" <qiuxu.zhuo@...el.com>, Jonathan.Cameron@...wei.com,
        rjw@...ysocki.net, sathyanarayanan.kuppuswamy@...ux.intel.com,
        "Raj, Ashok" <ashok.raj@...el.com>,
        "Luck, Tony" <tony.luck@...el.com>, linux-pci@...r.kernel.org,
        bhelgaas@...gle.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 04/10] PCI/RCEC: Add pcie_walk_rcec() to walk
 associated RCiEPs

On 4 Sep 2020, at 19:23, Bjorn Helgaas wrote:

> On Fri, Sep 04, 2020 at 10:18:30PM +0000, Kelley, Sean V wrote:
>> Hi Bjorn,
>>
>> Quick question below...
>>
>> On Wed, 2020-09-02 at 14:55 -0700, Sean V Kelley wrote:
>>> Hi Bjorn,
>>>
>>> On Wed, 2020-09-02 at 14:00 -0500, Bjorn Helgaas wrote:
>>>> On Wed, Aug 12, 2020 at 09:46:53AM -0700, Sean V Kelley wrote:
>>>>> From: Qiuxu Zhuo <qiuxu.zhuo@...el.com>
>>>>>
>>>>> When an RCEC device signals error(s) to a CPU core, the CPU core
>>>>> needs to walk all the RCiEPs associated with that RCEC to check
>>>>> errors. So add the function pcie_walk_rcec() to walk all RCiEPs
>>>>> associated with the RCEC device.
>>>>>
>>>>> Co-developed-by: Sean V Kelley <sean.v.kelley@...el.com>
>>>>> Signed-off-by: Sean V Kelley <sean.v.kelley@...el.com>
>>>>> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@...el.com>
>>>>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
>>>>> ---
>>>>>  drivers/pci/pci.h       |  4 +++
>>>>>  drivers/pci/pcie/rcec.c | 76
>>>>> +++++++++++++++++++++++++++++++++++++++++
>>>>>  2 files changed, 80 insertions(+)
>>>>>
>>>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>>>> index bd25e6047b54..8bd7528d6977 100644
>>>>> --- a/drivers/pci/pci.h
>>>>> +++ b/drivers/pci/pci.h
>>>>> @@ -473,9 +473,13 @@ static inline void pci_dpc_init(struct
>>>>> pci_dev
>>>>> *pdev) {}
>>>>>  #ifdef CONFIG_PCIEPORTBUS
>>>>>  void pci_rcec_init(struct pci_dev *dev);
>>>>>  void pci_rcec_exit(struct pci_dev *dev);
>>>>> +void pcie_walk_rcec(struct pci_dev *rcec, int (*cb)(struct
>>>>> pci_dev
>>>>> *, void *),
>>>>> +		    void *userdata);
>>>>>  #else
>>>>>  static inline void pci_rcec_init(struct pci_dev *dev) {}
>>>>>  static inline void pci_rcec_exit(struct pci_dev *dev) {}
>>>>> +static inline void pcie_walk_rcec(struct pci_dev *rcec, int
>>>>> (*cb)(struct pci_dev *, void *),
>>>>> +				  void *userdata) {}
>>>>>  #endif
>>>>>
>>>>>  #ifdef CONFIG_PCI_ATS
>>>>> diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
>>>>> index 519ae086ff41..405f92fcdf7f 100644
>>>>> --- a/drivers/pci/pcie/rcec.c
>>>>> +++ b/drivers/pci/pcie/rcec.c
>>>>> @@ -17,6 +17,82 @@
>>>>>
>>>>>  #include "../pci.h"
>>>>>
>>>>> +static int pcie_walk_rciep_devfn(struct pci_bus *bus, int
>>>>> (*cb)(struct pci_dev *, void *),
>>>>> +				 void *userdata, const unsigned long
>>>>> bitmap)
>>>>> +{
>>>>> +	unsigned int devn, fn;
>>>>> +	struct pci_dev *dev;
>>>>> +	int retval;
>>>>> +
>>>>> +	for_each_set_bit(devn, &bitmap, 32) {
>>>>> +		for (fn = 0; fn < 8; fn++) {
>>>>> +			dev = pci_get_slot(bus, PCI_DEVFN(devn, fn));
>>>>
>>>> Wow, this is a lot of churning to call pci_get_slot() 256 times per
>>>> bus for the "associated bus numbers" case where we pass a bitmap of
>>>> 0xffffffff.  They didn't really make it easy for software when they
>>>> added the next/last bus number thing.
>>>>
>>>> Just thinking out loud here.  What if we could set dev->rcec during
>>>> enumeration, and then use that to build pcie_walk_rcec()?
>>>
>>> I think follow what you are doing.
>>>
>>> As we enumerate an RCEC, use the time to discover RCiEPs and
>>> associate
>>> each RCiEP's dev->rcec. Although BIOS already set the bitmap for 
>>> this
>>> specific RCEC, it's more efficient to simply discover the devices
>>> through the bus walk and verify each one found against the bitmap.
>>>
>>> Further, while we can be certain that an RCiEP found with a matching
>>> device no. in a bitmap for an associated RCEC is correct, we cannot
>>> be
>>> certain that any RCiEP found on another bus range is correct unless
>>> we
>>> verify the bus is within that next/last bus range.
>>>
>>> Finally, that's where find_rcec() callback for rcec_assoc_rciep()
>>> does
>>> double duty by also checking on the "on-a-separate-bus" case 
>>> captured
>>> potentially by find_rcec() during an RCiEP's bus walk.
>>>
>>>
>>>>   bool rcec_assoc_rciep(rcec, rciep)
>>>>   {
>>>>     if (rcec->bus == rciep->bus)
>>>>       return (rcec->bitmap contains rciep->devfn);
>>>>
>>>>     return (rcec->next/last contains rciep->bus);
>>>>   }
>>>>
>>>>   link_rcec(dev, data)
>>>>   {
>>>>     struct pci_dev *rcec = data;
>>>>
>>>>     if ((dev is RCiEP) && rcec_assoc_rciep(rcec, dev))
>>>>       dev->rcec = rcec;
>>>>   }
>>>>
>>>>   find_rcec(dev, data)
>>>>   {
>>>>     struct pci_dev *rciep = data;
>>>>
>>>>     if ((dev is RCEC) && rcec_assoc_rciep(dev, rciep))
>>>>       rciep->rcec = dev;
>>>>   }
>>>>
>>>>   pci_setup_device
>>>>     ...
>>
>> I just noticed your use of pci_setup_device(). Are you suggesting
>> moving the call to pci_rcec_init() out of pci_init_capabilities() and
>> move it into pci_setup_device()?  If so, would pci_rcec_exit() still
>> remain in pci_release_capabilities()?
>>
>> I'm just wondering if it could just remain in 
>> pci_init_capabilities().
>
> Yeah, I didn't mean in pci_setup_device() specifically, just somewhere
> in the callchain of pci_setup_device().  But you're right, it probably
> would make more sense in pci_init_capabilities(), so I *should* have
> said pci_scan_single_device() to be a little less specific.
>
> Bjorn

I’ve done some experimenting with this approach, and I think there may 
be a problem of just walking the busses during enumeration 
pci_init_capabilities(). One problem is where one has an RCEC on a root 
bus: 6a(00.4) and an RCiEP on another root bus: 6b(00.0).  They will 
never find each other in this approach through a normal pci_bus_walk() 
call using their respective root_bus.

>  +-[0000:6b]-+-00.0
>  |           +-00.1
>  |           +-00.2
>  |           \-00.3
>  +-[0000:6a]-+-00.0
>  |           +-00.1
>  |           +-00.2
>  |           \-00.4

While having a lot of slot calls per bus is unfortunate, unless I’m 
mistaken you would have to walk every peer root_bus with your RCiEP in 
this example until you hit on the right RCEC, unless of course you have 
a bitmap associated RCEC on dev->bus.

Conversely, if you are enumerating the above RCEC at 6a(00.4) and you 
attempt to link_rcec() through calls to pci_walk_bus(), the walk will 
still be limited to 6a and below; never encountering 6b(00.0).  So you 
would then need an additional walk for each of the associated bus 
ranges, excluding the same bus as the RCEC.

pci_init_capabilities()
…
pci_init_rcec() // Cached

if (RCEC)
  Walk the dev->bus for bitmap associated RCiEP
  Walk all associated bus ranges for RCiEP

else if (RCiEP)
  Walk the dev->bus for bitmap associated RCEC
  Walk all peer root_bus for RCEC, confirm if own dev->bus falls within 
discovered RCEC associated ranges

The other problem here is temporal. I’m wondering if we may be trying 
to find associated devices at the pci_init_capabilities() stage prior to 
them being fully enumerated, i.e., RCEC has not been cached but we are 
searching with a future associated RCiEP.  So one could encounter a race 
condition where one is checking on an RCiEP whose associated RCEC has 
not been enumerated yet.

So let’s say one throws out RCiEP entirely and just relies upon RCEC 
to find the associations because one knows that an encountered RCEC (in 
pci_init_capabilities()) has already been cached. In that case you end 
up with the original implementation being done with this patch series…

if (RCEC)
  Walk the dev->bus for bitmap associated RCiEP
  Walk all associated bus ranges for RCiEP

Perhaps I’ve muddled some things here but it doesn’t look like the 
twain will meet unless I cover multiple peer root_bus and even then you 
may have an issue because the devices don’t yet fully exist from the 
perspective of the OS.

Thanks,

Sean

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ