Message-ID: <e967608f-ac8a-7a9c-35c5-821b6842653c@amd.com>
Date: Fri, 16 Jun 2023 16:30:49 -0700
From: Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>
To: Lukas Wunner <lukas@...ner.de>
Cc: linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
Bjorn Helgaas <bhelgaas@...gle.com>, oohall@...il.com,
Mahesh J Salgaonkar <mahesh@...ux.ibm.com>,
Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@...ux.intel.com>,
Yazen Ghannam <yazen.ghannam@....com>,
Fontenot Nathan <Nathan.Fontenot@....com>
Subject: Re: [PATCH v2 1/2] PCI: pciehp: Add support for async hotplug with
native AER and DPC/EDR
On 6/16/2023 10:31 AM, Lukas Wunner wrote:
> On Mon, May 22, 2023 at 03:23:57PM -0700, Smita Koralahalli wrote:
>> On 5/16/2023 3:10 AM, Lukas Wunner wrote:
>>> On Tue, Apr 18, 2023 at 09:05:25PM +0000, Smita Koralahalli wrote:
>
> I'd recommend clearing only PCI_EXP_DEVSTA_FED in PCI_EXP_DEVSTA.
>
> As for PCI_EXP_DPC_RP_PIO_STATUS, PCIe r6.1 sec 2.9.3 says that
> during DPC, either UR or CA completions are returned depending on
> the DPC Completion Control bit in the DPC Control register.
> The kernel doesn't touch that bit, so it will contain whatever value
> the BIOS has set. It seems fine to me to just clear all bits in
> PCI_EXP_DPC_RP_PIO_STATUS, as you've done in your patch.
>
> However, the RP PIO Status register is present only in Root Ports
> that support RP Extensions for DPC, per PCIe r6.1 sec 7.9.14.6.
> So you need to constrain that to "if (pdev->dpc_rp_extensions)".
>
Okay, will make the changes.
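
For illustration, the two suggestions above (clear only the Fatal Error Detected bit in Device Status, and touch RP PIO Status only when the port has RP Extensions for DPC) could be modelled roughly as below. This is a userspace sketch, not the kernel patch: struct mock_port, rw1c16(), rw1c32() and clear_surpdn_status() are hypothetical names, and the registers are modelled as plain write-1-to-clear fields rather than real config space accesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Device Status error bits (same values as PCI_EXP_DEVSTA_* in pci_regs.h). */
#define DEVSTA_CED  0x0001 /* Correctable Error Detected */
#define DEVSTA_NFED 0x0002 /* Non-Fatal Error Detected */
#define DEVSTA_FED  0x0004 /* Fatal Error Detected */
#define DEVSTA_URD  0x0008 /* Unsupported Request Detected */

/* Hypothetical stand-in for a Downstream Port; not a kernel structure. */
struct mock_port {
	uint16_t devsta;         /* models PCI_EXP_DEVSTA */
	uint32_t rp_pio_status;  /* models PCI_EXP_DPC_RP_PIO_STATUS */
	bool dpc_rp_extensions;  /* port has RP Extensions for DPC */
};

/* Write-1-to-clear: every bit set in val is cleared in the register. */
void rw1c16(uint16_t *reg, uint16_t val)
{
	*reg &= (uint16_t)~val;
}

void rw1c32(uint32_t *reg, uint32_t val)
{
	*reg &= ~val;
}

void clear_surpdn_status(struct mock_port *p)
{
	assert(p != NULL);

	/* Clear only FED; other Device Status bits are left alone. */
	rw1c16(&p->devsta, DEVSTA_FED);

	/* RP PIO Status exists only with RP Extensions for DPC. */
	if (p->dpc_rp_extensions)
		rw1c32(&p->rp_pio_status, ~0u);
}
```

In the sketch, a port without RP Extensions keeps its (modelled) RP PIO Status field untouched, which mirrors the "if (pdev->dpc_rp_extensions)" constraint requested above.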
>
>>>> + pci_aer_raw_clear_status(pdev);
>>>> + pci_clear_surpdn_errors(pdev);
>>>> +
>>>> + pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_STATUS,
>>>> + PCI_EXP_DPC_STATUS_TRIGGER);
>>>> +}
>>>
>>> Do you need a "wake_up_all(&dpc_completed_waitqueue);" at the end
>>> of the function to wake up a pciehp handler waiting for DPC recovery?
>>
>> I don't think so. The pciehp handler is however getting invoked
>> simultaneously due to PDSC or DLLSC state change right.. Let me know if I'm
>> missing anything here.
>
> I think you need to follow the procedure in dpc_reset_link().
>
> That function first waits for the link to go down, in accordance with
> PCIe r6.1 sec 6.2.11:
>
> if (!pcie_wait_for_link(pdev, false))
> ...
>
> Note that the link should not come back up due to a newly hot-added
> device until DPC Trigger Status is cleared.
>
> The function then waits for the Root Port to quiesce:
>
> if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev))
> ...
>
> And only then does the function clear DPC Trigger Status.
>
> You definitely need to wake_up_all(&dpc_completed_waitqueue) because
> pciehp may be waiting for DPC Trigger Status to clear.
>
> And you need to "clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags)"
> before calling wake_up_all().
>
>
Okay. I did not consider that the pciehp handler "may" wait for DPC
Trigger Status to be cleared. In my case both handlers were invoked
due to their respective bit changes, and I did not come across a case
where the pciehp handler was waiting for DPC to complete.
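
The ordering described above can be sketched as follows. This is a userspace model that only records the order of the steps; it is not the kernel implementation. struct mock_dpc_port, enum step, record() and handle_surprise_removal() are hypothetical names. Both waits are assumed to succeed, since what happens on a timeout is elided ("...") in the quoted snippets and is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum step {
	STEP_WAIT_LINK_DOWN,   /* models pcie_wait_for_link(pdev, false) */
	STEP_WAIT_RP_INACTIVE, /* models dpc_wait_rp_inactive(pdev) */
	STEP_CLEAR_TRIGGER,    /* models clearing DPC Trigger Status */
	STEP_CLEAR_RECOVERED,  /* models clear_bit(PCI_DPC_RECOVERED, ...) */
	STEP_WAKE_WAITERS,     /* models wake_up_all(&dpc_completed_waitqueue) */
};

#define MAX_STEPS 8

/* Hypothetical stand-in for the port state; not a kernel structure. */
struct mock_dpc_port {
	bool dpc_rp_extensions;
	bool trigger_status;
	bool recovered;
	bool waiters_woken;
	enum step log[MAX_STEPS];
	size_t nsteps;
};

void record(struct mock_dpc_port *p, enum step s)
{
	assert(p->nsteps < MAX_STEPS);
	p->log[p->nsteps++] = s;
}

void handle_surprise_removal(struct mock_dpc_port *p)
{
	/* 1. Wait for the link to go down (assumed to succeed here). */
	record(p, STEP_WAIT_LINK_DOWN);

	/* 2. Wait for the Root Port to quiesce, only with RP Extensions. */
	if (p->dpc_rp_extensions)
		record(p, STEP_WAIT_RP_INACTIVE);

	/* 3. Only then clear DPC Trigger Status. */
	p->trigger_status = false;
	record(p, STEP_CLEAR_TRIGGER);

	/* 4. Clear the "recovered" flag before waking anyone up. */
	p->recovered = false;
	record(p, STEP_CLEAR_RECOVERED);

	/* 5. Wake a pciehp handler that may be waiting for DPC to finish. */
	p->waiters_woken = true;
	record(p, STEP_WAKE_WAITERS);
}
```

The point of the model is the sequence: a waiter that is woken in step 5 observes Trigger Status already cleared and the recovered flag already reset.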
>>>> +static bool dpc_is_surprise_removal(struct pci_dev *pdev)
>>>> +{
>>>> + u16 status;
>>>> +
>>>> + pci_read_config_word(pdev, pdev->aer_cap + PCI_ERR_UNCOR_STATUS, &status);
>>>> +
>>>> + if (!(status & PCI_ERR_UNC_SURPDN))
>>>> + return false;
>>>> +
>>>
>>> You need an additional check for pdev->is_hotplug_bridge here.
>>>
>>> And you need to read PCI_EXP_SLTCAP and check for PCI_EXP_SLTCAP_HPS.
>>>
>>> Return false if either of them isn't set.
>>
>> Return false, if PCI_EXP_SLTCAP isn't set only correct? PCI_EXP_SLTCAP_HPS
>> should be disabled if DPC is enabled.
>>
>> Implementation notes in 6.7.6 says that:
>> "The Hot-Plug Surprise (HPS) mechanism, as indicated by the Hot-Plug
>> Surprise bit in the Slot Capabilities Register being Set, is deprecated
>> for use with async hot-plug. DPC is the recommended mechanism for supporting
>> async hot-plug."
>>
>> Platform FW will disable the SLTCAP_HPS bit at boot time to enable async
>> hotplug on AMD devices.
>
> Huh, is PCI_EXP_SLTCAP_HPS not set on the hotplug port in question?
>
> If it's not set, why do you get Surprise Down Errors in the first place?
>
> How do you bring down the slot without surprise-removal capability?
> Via sysfs?
>
As per the spec, section 6.7.6, "Either Downstream Port Containment (DPC)
or the Hot-Plug Surprise (HPS) mechanism may be used to support async
removal as part of an overall async hot-plug architecture".
Also, the implementation note below that text says that HPS is deprecated
and DPC is the recommended mechanism. More details can be found in
Appendix I, I.1 Async Hot-Plug Initial Configuration:
...
If DPC capability then,
    If HPS bit not Set, use DPC
    Else attempt to Clear HPS bit (§ Section 6.7.4.4)
        If successful, use DPC
        Else use HPS
...
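
As a rough illustration, the excerpt above maps onto a small decision function. This is a sketch under stated assumptions, not spec text or kernel code: struct mock_slot, try_clear_hps() and choose_async_mech() are hypothetical names, and the case without DPC capability falls outside the quoted excerpt, so it is reported as "not covered" rather than guessed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum async_mech {
	MECH_DPC,
	MECH_HPS,
	MECH_NOT_COVERED, /* no DPC capability: outside the quoted excerpt */
};

/* Hypothetical stand-in for a hot-plug slot; not a kernel structure. */
struct mock_slot {
	bool has_dpc;       /* port has the DPC capability */
	bool hps_set;       /* Hot-Plug Surprise bit in Slot Capabilities */
	bool hps_clearable; /* assumption: whether clearing HPS can succeed */
};

/* Attempt to clear HPS; returns true if the bit ends up clear. */
bool try_clear_hps(struct mock_slot *s)
{
	if (s->hps_clearable)
		s->hps_set = false;
	return !s->hps_set;
}

enum async_mech choose_async_mech(struct mock_slot *s)
{
	assert(s != NULL);

	if (!s->has_dpc)
		return MECH_NOT_COVERED;

	/* HPS bit not Set: use DPC. */
	if (!s->hps_set)
		return MECH_DPC;

	/* Else attempt to Clear HPS; DPC on success, HPS otherwise. */
	return try_clear_hps(s) ? MECH_DPC : MECH_HPS;
}
```

With this model, a slot that has DPC and already has HPS cleared (as firmware is described as doing here) always ends up on the DPC path.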
So this is "likely" a patch adding support for a new feature, where DPC
handles async removal. The BIOS will clear the HPS bit if DPC is chosen
as the recommended mechanism for handling async removal.
I see the slot being brought down by a PDC or DLLSC event, which is
triggered alongside DPC:
pciehp_handle_presence_or_link_change() -> pciehp_disable_slot() ->
__pciehp_disable_slot() -> remove_board()..
But I want to clarify one thing: are you implying that PDC or DLLSC
shouldn't be triggered when HPS is disabled?
Thanks,
Smita
>
>> Probably check if SLTCAP_HPS bit is set and return false?
>
> Quite the opposite! If it's not set, return false.
>
>
> Thanks,
>
> Lukas