Message-ID: <b746288b-32f0-4f08-8c9a-6adbe195d429@nvidia.com>
Date: Thu, 18 Jan 2024 07:57:10 +0530
From: Vidya Sagar <vidyas@...dia.com>
To: Lukas Wunner <lukas@...ner.de>, bhelgaas@...gle.com
Cc: alex.williamson@...hat.com, treding@...dia.com, jonathanh@...dia.com,
linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org, vsethi@...dia.com,
kthota@...dia.com, mmaddireddy@...dia.com, sagar.tv@...il.com
Subject: Re: [PATCH V3] PCI: pciehp: Disable ACS Source Validation during
hot-remove
Hi Lukas/Bjorn, any thoughts on this?
On 1/11/2024 7:14 PM, Vidya Sagar wrote:
>
>
> On 1/8/2024 7:49 PM, Lukas Wunner wrote:
>> On Thu, Jan 04, 2024 at 08:01:06PM +0530, Vidya Sagar wrote:
>>> On 8/1/2023 1:29 AM, Lukas Wunner wrote:
>>>> As an alternative to disabling ACS, have you explored masking ACS
>>>> Violations (PCI_ERR_UNC_ACSV) upon de-enumeration of a device and
>>>> unmasking them after assignment of a bus number?
>>>
>>> I explored this option and it seemed to work as expected. However, it
>>> works only if the AER registers are owned by the OS. If they are owned
>>> by the firmware (i.e. the Firmware-First approach to error handling),
>>> the OS is not supposed to access the AER registers, and it has no way
>>> to tell the firmware when enumeration has completed and it is safe to
>>> unmask ACS Violation errors in the AER Uncorrectable Error Mask
>>> register.
>>> Any thoughts on accommodating the Firmware-First approach as well?
>>
>> Are you actually using firmware-controlled AER or is it a theoretical
>> question?
> Yes. We indeed have a system with Firmware-Controlled AER.
>
>>
>> PCI Firmware Spec r3.3 sec 4.6.12 talks about a _DSM to disable DPC
>> on surprise-hotplug-capable ports. Maybe that would be an option?
> It looks like this _DSM is totally dependent on the port having SFI
> capability implemented and unfortunately our system doesn't have
> SFI implemented.
>
>>
>> BTW what happens if the system resumes from sleep and a device in
>> a hotplug-capable port doesn't have a bus number configured yet
>> (because it's been powered off and is now in D0uninitialized state)?
> Theoretically the answer seems to be yes. However, our platform is a
> server platform with no support for sleep and resume, so we can't
> confirm this behavior.
>
>> Could the ACS Violations then occur as well? Do we have to mask
>> ACS Violations *generally* on Root Ports and Downstream Ports when
>> going to system sleep and unmask them after setting a bus number
>> in the attached device on resume? And I suppose that would not
>> only be necessary for hotplug ports?
> Again, how would that be done on a system where AER is not handled
> natively by the OS? AFAIU, there is no mechanism for the OS to inform
> the firmware when it updates the bus number.
>
>>
>> Thanks,
>>
>> Lukas
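
For reference, the masking alternative discussed in the quoted thread
can be sketched in plain userspace C. This is only an illustration, not
the patch and not kernel code: struct fake_port and port_set_acsv_mask()
are hypothetical names, and the AER capability is simulated as an array
of dwords. Only the two register constants are taken from
include/uapi/linux/pci_regs.h. The sketch shows the limitation raised
above: when AER is firmware-owned, the OS must leave the mask alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h */
#define PCI_ERR_UNCOR_MASK 0x08        /* Uncorrectable Error Mask offset in AER cap */
#define PCI_ERR_UNC_ACSV   0x00200000u /* ACS Violation bit */

/* Hypothetical stand-in for a hotplug port; not a kernel structure. */
struct fake_port {
	bool aer_native;      /* true if the OS owns the AER registers */
	uint32_t aer_cap[16]; /* simulated AER capability, one entry per dword */
};

/*
 * Mask (mask == true) or unmask (mask == false) ACS Violation reporting.
 * Returns false without touching the register when AER is firmware-owned,
 * which is the Firmware-First case where this approach does not apply.
 */
bool port_set_acsv_mask(struct fake_port *port, bool mask)
{
	uint32_t *reg;

	assert(port != NULL);
	if (!port->aer_native)
		return false;

	reg = &port->aer_cap[PCI_ERR_UNCOR_MASK / 4];
	if (mask)
		*reg |= PCI_ERR_UNC_ACSV;   /* on de-enumeration */
	else
		*reg &= ~PCI_ERR_UNC_ACSV;  /* after bus number assignment */
	return true;
}
```

In this model the caller would mask on hot-remove and unmask once the
new device has a bus number; a false return marks the case where
another mechanism (such as disabling ACS Source Validation) is needed.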