Message-ID: <7345c2d2-5446-49a6-9ceb-0f1b9ee4ec18@nvidia.com>
Date: Thu, 11 Jan 2024 19:14:54 +0530
From: Vidya Sagar <vidyas@...dia.com>
To: Lukas Wunner <lukas@...ner.de>
Cc: bhelgaas@...gle.com, alex.williamson@...hat.com, treding@...dia.com,
jonathanh@...dia.com, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, vsethi@...dia.com, kthota@...dia.com,
mmaddireddy@...dia.com, sagar.tv@...il.com
Subject: Re: [PATCH V3] PCI: pciehp: Disable ACS Source Validation during
hot-remove
On 1/8/2024 7:49 PM, Lukas Wunner wrote:
> On Thu, Jan 04, 2024 at 08:01:06PM +0530, Vidya Sagar wrote:
>> On 8/1/2023 1:29 AM, Lukas Wunner wrote:
>>> As an alternative to disabling ACS, have you explored masking ACS
>>> Violations (PCI_ERR_UNC_ACSV) upon de-enumeration of a device and
>>> unmasking them after assignment of a bus number?
>>
>> I explored this option and it seemed to work as expected. But, the issue
>> is that this works only if the AER registers are owned by the OS. If the
>> AER registers are owned by the firmware (i.e. Firmware-First approach of
>> handling the errors), OS is not supposed to access the AER registers and
>> there is no indication from the OS to the firmware as to when the
>> enumeration is completed and time is apt to unmask the ACSViolation
>> errors in the AER's Uncorrectable Error Mask register.
>> Any thoughts on accommodating the Firmware-First approach also?
>
> Are you actually using firmware-controlled AER or is it a theoretical
> question?
Yes. We indeed have a system with Firmware-Controlled AER.
>
> PCI Firmware Spec r3.3 sec 4.6.12 talks about a _DSM to disable DPC
> on surprise-hotplug-capable ports. Maybe that would be an option?
It looks like this _DSM is totally dependent on the port having SFI
capability implemented and unfortunately our system doesn't have
SFI implemented.
>
> BTW what happens if the system resumes from sleep and a device in
> a hotplug-capable port doesn't have a bus number configured yet
> (because it's been powered off and is now in D0uninitialized state)?
Theoretically, yes, ACS Violations could occur in that case. However,
the platform we have is a server platform with no support for sleep and
resume, so I can't confirm this behavior.
> Could the ACS Violations then occur as well? Do we have to mask
> ACS Violations *generally* on Root Ports and Downstream Ports when
> going to system sleep and unmask them after setting a bus number
> in the attached device on resume? And I suppose that would not
> only be necessary for hotplug ports?
Again, how would that work on a system where AER is not handled natively
by the OS? AFAIU, there is no mechanism for the OS to inform the firmware
when it updates the bus number.
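
For reference, here is a rough, self-contained sketch of the masking
alternative and of where it breaks down under Firmware-First. It is only
a model, not a patch: struct fake_port and the two helpers are
hypothetical names made up for illustration, and the register is a plain
variable. In the kernel the same read-modify-write would presumably go
through pci_read_config_dword()/pci_write_config_dword() on the port's
AER Uncorrectable Error Mask register. The PCI_ERR_UNC_ACSV value is the
one from include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* ACS Violation bit (bit 21), as defined in include/uapi/linux/pci_regs.h. */
#define PCI_ERR_UNC_ACSV 0x00200000u

/* Hypothetical stand-in for a port's AER state; not a kernel structure. */
struct fake_port {
	bool aer_native;     /* true if the OS owns the AER registers */
	uint32_t uncor_mask; /* models the Uncorrectable Error Mask register */
};

/*
 * Mask ACS Violations before the device is de-enumerated.
 * Returns false and leaves the register alone under Firmware-First,
 * because the OS must not touch AER registers it does not own.
 */
static bool port_mask_acsv(struct fake_port *p)
{
	if (!p->aer_native)
		return false;
	p->uncor_mask |= PCI_ERR_UNC_ACSV;
	return true;
}

/* Unmask ACS Violations once the device has been assigned a bus number. */
static bool port_unmask_acsv(struct fake_port *p)
{
	if (!p->aer_native)
		return false;
	p->uncor_mask &= ~PCI_ERR_UNC_ACSV;
	return true;
}
```

With aer_native set, the mask/unmask pair brackets the window in which
the device has no bus number. With it cleared, both helpers refuse to
act, which is exactly the Firmware-First gap: the firmware owns the mask
register but has no way to learn when enumeration has finished.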
>
> Thanks,
>
> Lukas