Message-ID: <e967608f-ac8a-7a9c-35c5-821b6842653c@amd.com>
Date:   Fri, 16 Jun 2023 16:30:49 -0700
From:   Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>
To:     Lukas Wunner <lukas@...ner.de>
Cc:     linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
        Bjorn Helgaas <bhelgaas@...gle.com>, oohall@...il.com,
        Mahesh J Salgaonkar <mahesh@...ux.ibm.com>,
        Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        Yazen Ghannam <yazen.ghannam@....com>,
        Fontenot Nathan <Nathan.Fontenot@....com>
Subject: Re: [PATCH v2 1/2] PCI: pciehp: Add support for async hotplug with
 native AER and DPC/EDR

On 6/16/2023 10:31 AM, Lukas Wunner wrote:
> On Mon, May 22, 2023 at 03:23:57PM -0700, Smita Koralahalli wrote:
>> On 5/16/2023 3:10 AM, Lukas Wunner wrote:
>>> On Tue, Apr 18, 2023 at 09:05:25PM +0000, Smita Koralahalli wrote:

> 
> I'd recommend clearing only PCI_EXP_DEVSTA_FED in PCI_EXP_DEVSTA.
> 
> As for PCI_EXP_DPC_RP_PIO_STATUS, PCIe r6.1 sec 2.9.3 says that
> during DPC, either UR or CA completions are returned depending on
> the DPC Completion Control bit in the DPC Control register.
> The kernel doesn't touch that bit, so it will contain whatever value
> the BIOS has set. It seems fine to me to just clear all bits in
> PCI_EXP_DPC_RP_PIO_STATUS, as you've done in your patch.
> 
> However, the RP PIO Status register is present only in Root Ports
> that support RP Extensions for DPC, per PCIe r6.1 sec 7.9.14.6.
> So you need to constrain that to "if (pdev->dpc_rp_extensions)".
>

Okay, will make changes.
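
Something like the below for the helper, I think (a rough sketch of the
reworked pci_clear_surpdn_errors() from this patch; I'm assuming the
standard defines from pci_regs.h):

static void pci_clear_surpdn_errors(struct pci_dev *pdev)
{
	/*
	 * RP PIO Status is present only in Root Ports that implement
	 * RP Extensions for DPC (PCIe r6.1 sec 7.9.14.6).
	 */
	if (pdev->dpc_rp_extensions)
		pci_write_config_dword(pdev, pdev->dpc_cap +
				       PCI_EXP_DPC_RP_PIO_STATUS, ~0);

	/* Clear only Fatal Error Detected, leave the other DEVSTA bits alone */
	pcie_capability_write_word(pdev, PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_FED);
}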

> 
>>>> +	pci_aer_raw_clear_status(pdev);
>>>> +	pci_clear_surpdn_errors(pdev);
>>>> +
>>>> +	pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_STATUS,
>>>> +			      PCI_EXP_DPC_STATUS_TRIGGER);
>>>> +}
>>>
>>> Do you need a "wake_up_all(&dpc_completed_waitqueue);" at the end
>>> of the function to wake up a pciehp handler waiting for DPC recovery?
>>
>> I don't think so. The pciehp handler is, however, getting invoked
>> simultaneously due to PDSC or DLLSC state change, right? Let me know if
>> I'm missing anything here.
> 
> I think you need to follow the procedure in dpc_reset_link().
> 
> That function first waits for the link to go down, in accordance with
> PCIe r6.1 sec 6.2.11:
> 
> 	if (!pcie_wait_for_link(pdev, false))
> 	...
> 
> Note that the link should not come back up due to a newly hot-added
> device until DPC Trigger Status is cleared.
> 
> The function then waits for the Root Port to quiesce:
> 
> 	if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev))
> 	...
> 
> And only then does the function clear DPC Trigger Status.
> 
> You definitely need to wake_up_all(&dpc_completed_waitqueue) because
> pciehp may be waiting for DPC Trigger Status to clear.
> 
> And you need to "clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags)"
> before calling wake_up_all().
> 
>

Okay. I did not consider the fact that the pciehp handler "may" wait on DPC
Trigger Status to be cleared. In my case both handlers were invoked by
their respective bit changes, and I did not come across the case where the
pciehp handler was waiting for DPC to complete.
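
To make sure I follow, the surprise-removal handler would then mirror
dpc_reset_link() and end with the wakeup, roughly like this (a sketch only;
it reuses the existing pcie_wait_for_link()/dpc_wait_rp_inactive() helpers
and the PCI_DPC_RECOVERED flag you mention):

	/* Wait for the Link to go down, per PCIe r6.1 sec 6.2.11 */
	if (!pcie_wait_for_link(pdev, false))
		pci_info(pdev, "Data Link Layer Link Active not cleared in 1000 msec\n");

	/* Wait for the Root Port to quiesce before clearing Trigger Status */
	if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev))
		goto out;

	pci_aer_raw_clear_status(pdev);
	pci_clear_surpdn_errors(pdev);

	pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_STATUS,
			      PCI_EXP_DPC_STATUS_TRIGGER);

out:
	/* Let a pciehp handler waiting for DPC recovery proceed */
	clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags);
	wake_up_all(&dpc_completed_waitqueue);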


>>>> +static bool dpc_is_surprise_removal(struct pci_dev *pdev)
>>>> +{
>>>> +	u16 status;
>>>> +
>>>> +	pci_read_config_word(pdev, pdev->aer_cap + PCI_ERR_UNCOR_STATUS, &status);
>>>> +
>>>> +	if (!(status & PCI_ERR_UNC_SURPDN))
>>>> +		return false;
>>>> +
>>>
>>> You need an additional check for pdev->is_hotplug_bridge here.
>>>
>>> And you need to read PCI_EXP_SLTCAP and check for PCI_EXP_SLTCAP_HPS.
>>>
>>> Return false if either of them isn't set.
>>
>> Return false, if PCI_EXP_SLTCAP isn't set only correct? PCI_EXP_SLTCAP_HPS
>> should be disabled if DPC is enabled.
>>
>> Implementation notes in 6.7.6 says that:
>> "The Hot-Plug Surprise (HPS) mechanism, as indicated by the Hot-Plug
>> Surprise bit in the Slot Capabilities Register being Set, is deprecated
>> for use with async hot-plug. DPC is the recommended mechanism for supporting
>> async hot-plug."
>>
>> Platform FW will disable the SLTCAP_HPS bit at boot time to enable async
>> hotplug on AMD devices.
> 
> Huh, is PCI_EXP_SLTCAP_HPS not set on the hotplug port in question?
> 
> If it's not set, why do you get Surprise Down Errors in the first place?
> 
> How do you bring down the slot without surprise-removal capability?
> Via sysfs?
>

As per the spec, Section 6.7.6, "Either Downstream Port Containment (DPC)
or the Hot-Plug Surprise (HPS) mechanism may be used to support async
removal as part of an overall async hot-plug architecture".

Also, the implementation note below it conveys that HPS is deprecated and
DPC is the recommended mechanism. More details can be found in Appendix I,
I.1 Async Hot-Plug Initial Configuration:
...
If DPC capability then,
	If HPS bit not Set, use DPC
	Else attempt to Clear HPS bit (§ Section 6.7.4.4)
		If successful, use DPC
		Else use HPS
...

So, this is "likely" a new feature support patch where DPC supports 
async remove. HPS bit will be disabled by BIOS if DPC is chosen as 
recommended mechanism to handle async removal.

I see the slot being brought down by a PDC or DLLSC event, which is
triggered alongside DPC.

pciehp_handle_presence_or_link_change() -> pciehp_disable_slot() -> 
__pciehp_disable_slot() -> remove_board()..

But I want to clarify one thing: are you implying that PDC or DLLSC
shouldn't be triggered when HPS is disabled?

Thanks,
Smita

> 
>> Probably check if SLTCAP_HPS bit is set and return false?
> 
> Quite the opposite!  If it's not set, return false.
> 
> 
> Thanks,
> 
> Lukas
