Date: Mon, 13 May 2024 15:15:38 +0530
From: Sourabh Jain <sourabhjain@...ux.ibm.com>
To: Krishna Kumar <krishnak@...ux.ibm.com>, mpe@...erman.id.au,
        npiggin@...il.com
Cc: nathanl@...ux.ibm.com, aneesh.kumar@...nel.org, linux-pci@...r.kernel.org,
        linux-kernel@...r.kernel.org, gbatra@...ux.ibm.com,
        bhelgaas@...gle.com, tpearson@...torengineering.com, oohall@...il.com,
        brking@...ux.vnet.ibm.com, mahesh.salgaonkar@...ibm.com,
        linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH 2/2] arch/powerpc: hotplug driver bridge support

Hello Krishna,

Is the "arch" prefix in the commit title really required? Commits under
arch/powerpc usually start the title with just "powerpc/...".

On 09/05/24 17:35, Krishna Kumar wrote:
> There is an issue with the hotplug operation when it's done on the
> bridge/switch slot. The bridge-port and devices behind the bridge, which
> become offline by hot-unplug operation, don't get hot-plugged/enabled by
> doing hot-plug operation on that slot. Only the first port of the bridge
> gets enabled and the remaining port/devices remain unplugged. The hot
> plug/unplug operation is done by the hotplug driver
> (drivers/pci/hotplug/pnv_php.c).
>
> Root Cause Analysis: This behavior is due to missing code for the DPC
> switch/bridge. The existing driver depends on pci_hp_add_devices()
> function for device enablement. This function calls pci_scan_slot() on
> only one device-node/port of the bridge, not on all the siblings'
> device-node/port.
>
> The missing code needs to be added which will find all the sibling
> device-nodes/bridge-ports and will run explicit pci_scan_slot() on
> those.  A new function has been added for this purpose which gets
> invoked from pci_hp_add_devices(). This new function
> pci_traverse_sibling_nodes_and_scan_slot() gets all the sibling
> bridge-ports by traversal and explicitly invokes pci_scan_slot on them.
>
>
> Signed-off-by: Krishna Kumar <krishnak@...ux.ibm.com>
> ---
>
> Command for reproducing the issue :
>
> For hot unplug/disable - echo 0 > /sys/bus/pci/slots/C5/power
> For hot plug/enable -    echo 1 > /sys/bus/pci/slots/C5/power
>
> where C5 is slot associated with bridge.
>
> Scenario/Tests:
> Output of lspci -nn before test is given below. This snippet contains
> devices used for testing on Powernv machine.
>
> 0004:02:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052]
> 0004:02:01.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052]
> 0004:02:02.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052]
> 0004:02:03.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052]
> 0004:08:00.0 Serial Attached SCSI controller [0107]:
> Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 [1000:00c9] (rev 01)
> 0004:09:00.0 Serial Attached SCSI controller [0107]:
> Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 [1000:00c9] (rev 01)
>
> Output of lspci -tv before test is as follows:
>
> # lspci -tv
>   +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]--
>   |                           |               +-01.0-[08]----00.0  Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3
>   |                           |               +-02.0-[09]----00.0  Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3
>   |                           |               \-03.0-[0a-0e]--
>   |                           \-00.1  PMC-Sierra Inc. Device 4052
>
> C5(bridge) and C6(End Point) slot address are as below:
> # cat /sys/bus/pci/slots/C5/address
> 0004:02:00
> # cat /sys/bus/pci/slots/C6/address
> 0004:09:00
>
> Hot-unplug operation on slot associated with bridge:
> # echo 0 > /sys/bus/pci/slots/C5/power
> # lspci -tv
>   +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--
>   |                           \-00.1  PMC-Sierra Inc. Device 4052
>
>  From the above lspci -tv output, it can be observed that hot unplug
> operation has removed all the PMC-Sierra bridge ports like:
> 00.0-[03-07], 01.0-[08], 02.0-[09], 03.0-[0a-0e] and the SAS devices
> behind the bridge-port. Without the fix, when the hot plug operation is
> done on the same slot, it adds only the first bridge port and doesn't
> restore all the bridge-ports and devices that it unplugged earlier.
> Below snippet shows this.
>
> Hot-plug operation on the bridge slot without the fix:
> # echo 1 > /sys/bus/pci/slots/C5/power
> # lspci -tv
>   +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]--
>   |                           \-00.1  PMC-Sierra Inc. Device 4052
>
> After the fix, it restores all the devices in the same manner how it
> unplugged them earlier during the hot unplug operation. The below snippet
> shows the same.
> Hot-plug operation on bridge slot with the fix:
> # echo 1 > /sys/bus/pci/slots/C5/power
> # lspci -tv
>   +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]--
>   |                           |               +-01.0-[08]----00.0  Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3
>   |                           |               +-02.0-[09]----00.0  Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3
>   |                           |               \-03.0-[0a-0e]--
>   |                           \-00.1  PMC-Sierra Inc. Device 4052
>
> Removal of End point device behind bridge are also intact and behaving
> correctly.
> Hot-unplug operation on Endpoint device C6:
> # echo 0 > /sys/bus/pci/slots/C6/power
> # lspci -tv
>   +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]--
>   |                           |               +-01.0-[08]----00.0  Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3
>   |                           |               +-02.0-[09]--
>   |                           |               \-03.0-[0a-0e]--
>   |                           \-00.1  PMC-Sierra Inc. Device 4052
>
> Hot-plug operation on Endpoint device C6:
> # echo 1 > /sys/bus/pci/slots/C6/power
> # lspci -tv
>   +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]--
>   |                           |               +-01.0-[08]----00.0  Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3
>   |                           |               +-02.0-[09]----00.0  Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3
>   |                           |               \-03.0-[0a-0e]--
>   |                           \-00.1  PMC-Sierra Inc. Device 4052
>
>
>
>   arch/powerpc/include/asm/ppc-pci.h |  4 +++
>   arch/powerpc/kernel/pci-hotplug.c  |  5 ++--
>   arch/powerpc/kernel/pci_dn.c       | 42 ++++++++++++++++++++++++++++++
>   3 files changed, 48 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
> index a8b7e8682f5b..a5d5ee4ff7c0 100644
> --- a/arch/powerpc/include/asm/ppc-pci.h
> +++ b/arch/powerpc/include/asm/ppc-pci.h
> @@ -28,6 +28,10 @@ struct pci_dn;
>   void *pci_traverse_device_nodes(struct device_node *start,
>   				void *(*fn)(struct device_node *, void *),
>   				void *data);
> +
> +void *pci_traverse_sibling_nodes_and_scan_slot(struct device_node *start,
> +					       struct pci_bus *bus);
> +
>   extern void pci_devs_phb_init_dynamic(struct pci_controller *phb);
>   
>   #if defined(CONFIG_IOMMU_API) && (defined(CONFIG_PPC_PSERIES) || \
> diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
> index 0fe251c6ac2c..639a3d592fe2 100644
> --- a/arch/powerpc/kernel/pci-hotplug.c
> +++ b/arch/powerpc/kernel/pci-hotplug.c
> @@ -106,7 +106,7 @@ EXPORT_SYMBOL_GPL(pci_hp_remove_devices);
>    */
>   void pci_hp_add_devices(struct pci_bus *bus)
>   {
> -	int slotno, mode, max;
> +	int mode, max;
>   	struct pci_dev *dev;
>   	struct pci_controller *phb;
>   	struct device_node *dn = pci_bus_to_OF_node(bus);
> @@ -129,8 +129,7 @@ void pci_hp_add_devices(struct pci_bus *bus)
>   		 * order for fully rescan all the way down to pick them up.
>   		 * They can have been removed during partial hotplug.
>   		 */
> -		slotno = PCI_SLOT(PCI_DN(dn->child)->devfn);
> -		pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
> +		pci_traverse_sibling_nodes_and_scan_slot(dn, bus);
>   		max = bus->busn_res.start;
>   		/*
>   		 * Scan bridges that are already configured. We don't touch
> diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
> index 38561d6a2079..2e202f9cec21 100644
> --- a/arch/powerpc/kernel/pci_dn.c
> +++ b/arch/powerpc/kernel/pci_dn.c
> @@ -493,4 +493,46 @@ static void pci_dev_pdn_setup(struct pci_dev *pdev)
>   	pdn = pci_get_pdn(pdev);
>   	pdev->dev.archdata.pci_data = pdn;
>   }
> +
> +void *pci_traverse_sibling_nodes_and_scan_slot(struct device_node *start, struct pci_bus *bus)

Two things about the return type of the above function:

1. The function only ever returns NULL.
2. The caller doesn't take any action based on the return value.

How about changing the return type from void * to just void?

> +{
> +	struct device_node *dn;
> +	struct device_node *parent;

The parent variable is not really required; start can be passed to
for_each_child_of_node() directly.

> +	int slotno;
> +
> +	const __be32 *classp1;
> +	u32 class1 = 0;
> +
> +	classp1 = of_get_property(start->child, "class-code", NULL);
> +	if (classp1)
> +		class1 = of_read_number(classp1, 1);
> +
> +	/* Call of pci_scan_slot for non-bridge/EP case */
> +	if (!((class1 >> 8) == PCI_CLASS_BRIDGE_PCI)) {
> +		slotno = PCI_SLOT(PCI_DN(start->child)->devfn);
> +		pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
> +		return NULL;
> +	}
> +
> +	/* Iterate all siblings */
> +	parent = start;
> +	for_each_child_of_node(parent, dn) {
> +		const __be32 *classp;
> +		u32 class = 0;
> +
> +		classp = of_get_property(dn, "class-code", NULL);
> +		if (classp)
> +			class = of_read_number(classp, 1);
> +
> +		/* Call of pci_scan_slot on each sibling-nodes/bridge-ports */
> +		if ((class >> 8) == PCI_CLASS_BRIDGE_PCI) {
> +			slotno = PCI_SLOT(PCI_DN(dn)->devfn);
> +			pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
> +		}
> +	}
> +
> +	return NULL;

Some code duplication can be avoided by handling both cases inside the
for_each_child_of_node() loop:
1. The pci_scan_slot() call for the non-bridge/EP case
2. The pci_scan_slot() call for the bridge-port case

Something like this:

     u32 class;
     int slotno;
     const __be32 *classp;
     struct device_node *dn;

     for_each_child_of_node(start, dn) {
         class = 0;
         classp = of_get_property(dn, "class-code", NULL);
         if (classp)
             class = of_read_number(classp, 1);

         /* Call of pci_scan_slot for non-bridge/EP case */
         if (!((class >> 8) == PCI_CLASS_BRIDGE_PCI) && start->child == dn) {
             slotno = PCI_SLOT(PCI_DN(dn)->devfn);
             pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
             of_node_put(dn);
             return;
         }

         /* Call of pci_scan_slot for bridge port */
         if ((class >> 8) == PCI_CLASS_BRIDGE_PCI) {
             slotno = PCI_SLOT(PCI_DN(dn)->devfn);
             pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
         }
     }
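
To make the intended control flow easier to check outside the kernel, here
is a rough userspace sketch of the same decision logic. It is only an
illustration under stated assumptions: struct mock_node and
mock_scan_children() are hypothetical stand-ins for the device-tree children
and for the loop above, the scanned[] array records which devfn would have
been handed to pci_scan_slot(), and reference counting (of_node_put) is not
modelled. The macro values are copied from the kernel's PCI headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Same definitions as the kernel's PCI_DEVFN/PCI_SLOT/PCI_CLASS_BRIDGE_PCI. */
#define PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define PCI_CLASS_BRIDGE_PCI  0x0604

/* Hypothetical stand-in for one child device node. */
struct mock_node {
    uint32_t class_code; /* 24-bit "class-code" value, 0 if the property is absent */
    uint8_t devfn;       /* what PCI_DN(dn)->devfn would hold */
};

/*
 * Hypothetical model of the single-loop variant: records in scanned[]
 * the devfn that would be passed to pci_scan_slot() and returns how
 * many entries were recorded. scanned[] must have room for n entries.
 */
static size_t mock_scan_children(const struct mock_node *child, size_t n,
                                 uint8_t *scanned)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        /* Drop the low byte (prog-if) and compare base class + subclass. */
        int is_bridge = (child[i].class_code >> 8) == PCI_CLASS_BRIDGE_PCI;

        /* Non-bridge/EP case: scan only the first child's slot and stop. */
        if (!is_bridge && i == 0) {
            scanned[count++] = PCI_DEVFN(PCI_SLOT(child[i].devfn), 0);
            return count;
        }

        /* Bridge-port case: scan the slot of every bridge sibling. */
        if (is_bridge)
            scanned[count++] = PCI_DEVFN(PCI_SLOT(child[i].devfn), 0);
    }

    return count;
}
```

With four children of class 0x060400 at slots 0 to 3 (like the 0004:02:00.0
to 0004:02:03.0 ports in the lspci output), all four slots are recorded.
With a non-bridge first child, only that child's slot is recorded and the
walk stops. An empty child list records nothing.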


> +}
> +EXPORT_SYMBOL_GPL(pci_traverse_sibling_nodes_and_scan_slot);

Why does this function need to be exported? The only caller in this patch
is pci_hp_add_devices() in arch/powerpc/kernel/pci-hotplug.c.

> +
>   DECLARE_PCI_FIXUP_EARLY(PCI_ANY_ID, PCI_ANY_ID, pci_dev_pdn_setup);

- Sourabh Jain
