Date:	Fri, 26 Oct 2012 14:59:21 +0800
From:	Cyberman Wu <cypher.w@...il.com>
To:	linux-pci@...r.kernel.org
Cc:	linux-kernel@...r.kernel.org
Subject: PCIe IO space support on Tilera GX: Is there any one who can confirm
 my modification to fix it is OK?

After we upgraded to MDE 4.1.0 from Tilera, we ran into a problem:
only one HighPoint 2680 card works. I've tried to fix it, but since I
work in user space most of the time, I'm not sure my fix is
sufficient. Tilera's FAE said that the person who added PCIe I/O space
support is on vacation, so I can't get help from him right now. I hope
somebody here can help.


Problem we encountered:

pci 0000:00:00.0: BAR 8: assigned [mem 0x100c0000000-0x100c00fffff]
pci 0000:00:00.0: BAR 9: assigned [mem 0x100c0100000-0x100c01fffff pref]
pci 0000:00:00.0: BAR 7: assigned [io  0x0000-0x0fff]
pci 0000:01:00.0: BAR 6: assigned [mem 0x100c0100000-0x100c013ffff pref]
pci 0000:01:00.0: BAR 6: set to [mem 0x100c0100000-0x100c013ffff pref] (PCI address [0xc0100000-0xc013ffff])
pci 0000:01:00.0: BAR 4: assigned [mem 0x100c0000000-0x100c000ffff 64bit]
pci 0000:01:00.0: BAR 4: set to [mem 0x100c0000000-0x100c000ffff 64bit] (PCI address [0xc0000000-0xc000ffff])
pci 0000:01:00.0: BAR 2: assigned [io  0x0000-0x007f]
pci 0000:01:00.0: BAR 2: set to [io  0x0000-0x007f] (PCI address [0x0-0x7f])
pci 0000:00:00.0: PCI bridge to [bus 01-01]
pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:00.0:   bridge window [mem 0x100c0000000-0x100c00fffff]
pci 0000:00:00.0:   bridge window [mem 0x100c0100000-0x100c01fffff pref]
pci 0001:00:00.0: BAR 8: assigned [mem 0x101c0000000-0x101c00fffff]
pci 0001:00:00.0: BAR 9: assigned [mem 0x101c0100000-0x101c01fffff pref]
pci 0001:00:00.0: BAR 7: assigned [io  0x0000-0x0fff]
pci 0001:01:00.0: BAR 6: assigned [mem 0x101c0100000-0x101c013ffff pref]
pci 0001:01:00.0: BAR 6: set to [mem 0x101c0100000-0x101c013ffff pref] (PCI address [0xc0100000-0xc013ffff])
pci 0001:01:00.0: BAR 4: assigned [mem 0x101c0000000-0x101c000ffff 64bit]
pci 0001:01:00.0: BAR 4: set to [mem 0x101c0000000-0x101c000ffff 64bit] (PCI address [0xc0000000-0xc000ffff])
pci 0001:01:00.0: BAR 2: assigned [io  0x0000-0x007f]
pci 0001:01:00.0: BAR 2: set to [io  0x0000-0x007f] (PCI address [0x0-0x7f])
pci 0001:00:00.0: PCI bridge to [bus 01-01]
pci 0001:00:00.0:   bridge window [io  0x0000-0x0fff]
pci 0001:00:00.0:   bridge window [mem 0x101c0000000-0x101c00fffff]
pci 0001:00:00.0:   bridge window [mem 0x101c0100000-0x101c01fffff pref]
pci 0000:00:00.0: enabling device (0006 -> 0007)
pci 0001:00:00.0: enabling device (0006 -> 0007)
pci_bus 0000:00: resource 0 [io  0x0000-0xffffffff]
pci_bus 0000:00: resource 1 [mem 0x100c0000000-0x100ffffffff]
pci_bus 0000:01: resource 0 [io  0x0000-0x0fff]
pci_bus 0000:01: resource 1 [mem 0x100c0000000-0x100c00fffff]
pci_bus 0000:01: resource 2 [mem 0x100c0100000-0x100c01fffff pref]
pci_bus 0001:00: resource 0 [io  0x0000-0xffffffff]
pci_bus 0001:00: resource 1 [mem 0x101c0000000-0x101ffffffff]
pci_bus 0001:01: resource 0 [io  0x0000-0x0fff]
pci_bus 0001:01: resource 1 [mem 0x101c0000000-0x101c00fffff]
pci_bus 0001:01: resource 2 [mem 0x101c0100000-0x101c01fffff pref]
......
mvsas 0000:01:00.0: mvsas: driver version 0.8.2
mvsas 0000:01:00.0: enabling device (0000 -> 0003)
mvsas 0000:01:00.0: enabling bus mastering
mvsas 0000:01:00.0: mvsas: PCI-E x4, Bandwidth Usage: 2.5 Gbps
mvsas 0000:01:00.0: Phy3 : No sig fis
scsi0 : mvsas
......
mvsas 0001:01:00.0: mvsas: driver version 0.8.2
mvsas 0001:01:00.0: enabling device (0000 -> 0003)
mvsas 0001:01:00.0: enabling bus mastering
mvsas 0001:01:00.0: BAR 2: can't reserve [io  0x0000-0x007f]
mvsas: probe of 0001:01:00.0 failed with error -16
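
For readers unfamiliar with the error code: -16 is -EBUSY on Linux. Both domains were assigned the same I/O range (0x0000-0x007f for BAR 2), so the second driver instance's claim presumably collides with the first one in a system-wide I/O port resource tree. The sketch below is only a toy model of that behaviour, not the kernel's resource code; struct io_tree and toy_request_region() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one system-wide I/O port resource table (hypothetical). */
struct io_region {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

#define MAX_REGIONS 16

struct io_tree {
	struct io_region regions[MAX_REGIONS];
	size_t count;
};

/*
 * Claim [start, end]. Returns 0 on success, -EBUSY if the range
 * overlaps a range that was claimed earlier, -ENOMEM if the toy
 * table is full.
 */
static int toy_request_region(struct io_tree *tree, uint64_t start,
			      uint64_t end)
{
	size_t i;

	assert(start <= end);

	for (i = 0; i < tree->count; i++) {
		const struct io_region *r = &tree->regions[i];

		/* Two inclusive ranges overlap if each starts before the other ends. */
		if (start <= r->end && end >= r->start)
			return -EBUSY;
	}

	if (tree->count == MAX_REGIONS)
		return -ENOMEM;

	tree->regions[tree->count].start = start;
	tree->regions[tree->count].end = end;
	tree->count++;
	return 0;
}
```

In this model, claiming 0x0000-0x007f once succeeds and claiming it a second time returns -EBUSY, which is the pattern seen for 0001:01:00.0 in the log above.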


My modification:

--- /opt/tilera/TileraMDE-4.1.0.148119/tilegx/src/linux-2.6.40.38/arch/tile/kernel/pci_gx.c	2012-10-22 14:56:59.783096378 +0800
+++ Tilera_src/src/linux-2.6.40.38/arch/tile/kernel/pci_gx.c	2012-10-26 13:55:02.731947886 +0800
@@ -368,6 +368,10 @@
 	int num_trio_shims = 0;
 	int ctl_index = 0;
 	int i, j;
+	// Modified by Cyberman Wu on Oct 25th, 2012.
+	resource_size_t io_mem_start;
+	resource_size_t io_mem_end;
+	resource_size_t io_mem_size;

 	if (!pci_probe) {
 		pr_info("PCI: disabled by boot argument\n");
@@ -457,6 +461,18 @@
 	}

 out:
+	// Giving every controller the whole I/O space 0~0xffffffff makes
+	// drivers that use I/O regions fail to load for devices on any
+	// controller other than the first, because the ranges collide.
+	// Is reserving the first 4K of I/O address space OK? Tilera
+	// starts I/O space addresses at 0, but some Linux drivers
+	// (mvsas, for example) treat address 0 as an error, so for
+	// compatibility it seems better to keep the low addresses unused?
+	// Modified by Cyberman Wu on Oct 25th, 2012.
+	io_mem_start = 4096;
+	io_mem_end = (resource_size_t)IO_SPACE_LIMIT + 1;
+	io_mem_size = (io_mem_end - io_mem_start) / num_rc_controllers;
+	io_mem_size &= ~3;
 	/*
 	 * Configure each PCIe RC port.
 	 */
@@ -470,8 +486,9 @@
 		controller->index = i;
 		controller->ops = &tile_cfg_ops;

-		controller->io_space.start = 0;
-		controller->io_space.end = IO_SPACE_LIMIT;
+		// Modified by Cyberman Wu on Oct 25th, 2012.
+		controller->io_space.start = io_mem_start + (i * io_mem_size);
+		controller->io_space.end = controller->io_space.start + io_mem_size - 1;
 		controller->io_space.flags = IORESOURCE_IO;
 		snprintf(controller->io_space_name,
 			 sizeof(controller->io_space_name),
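
The arithmetic in the patch can be checked outside the kernel. The following is a minimal user-space sketch, assuming IO_SPACE_LIMIT is 0xffffffff (as the "io 0x0000-0xffffffff" root bus resource in the first log suggests) and two RC controllers; struct io_window and partition_io_space() are hypothetical names, not part of pci_gx.c.

```c
#include <assert.h>
#include <stdint.h>

/* One controller's slice of the I/O port space (hypothetical type). */
struct io_window {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/*
 * Mirror of the patch's arithmetic: hold back the first `reserved`
 * bytes, split the rest of [0, limit] evenly between the controllers,
 * and round each share down to a multiple of 4 bytes.
 */
static struct io_window partition_io_space(uint64_t reserved,
					    uint64_t limit,
					    unsigned int num_controllers,
					    unsigned int index)
{
	struct io_window w;
	uint64_t size;

	assert(num_controllers > 0 && index < num_controllers);
	assert(reserved <= limit);

	size = (limit + 1 - reserved) / num_controllers;
	size &= ~(uint64_t)3;

	w.start = reserved + (uint64_t)index * size;
	w.end = w.start + size - 1;
	return w;
}
```

For two controllers with the first 4096 bytes held back, partition_io_space(4096, 0xffffffff, 2, i) yields 0x1000-0x800007ff for controller 0 and 0x80000800-0xffffffff for controller 1, matching the root bus I/O resources in the log after the fix. If the controller count does not divide the space evenly, the rounding leaves a small tail at the top of the space unused.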


Please note that we're using MDE-4.1.0, which takes kernel 3.0.38,
patches it, and renumbers it as 2.6.40.38.
I've checked the source code under arch/tile in kernel 3.6.3, and PCIe
I/O space support is still not there. Below is the diff of
arch/tile/kernel/pci_gx.c between kernel 3.6.3 and MDE-4.1.0:

--- .cache/.fr-9Oo37J/linux-3.6.3/arch/tile/kernel/pci_gx.c	2012-10-22 00:32:56.000000000 +0800
+++ /opt/tilera/TileraMDE-4.1.0.148119/tilegx/src/linux-2.6.40.38/arch/tile/kernel/pci_gx.c	2012-10-22 14:56:59.783096378 +0800
@@ -69,19 +69,18 @@
  * a HW PCIe link-training bug. The exact delay is specified with
  * a kernel boot argument in the form of "pcie_rc_delay=T,P,S",
  * where T is the TRIO instance number, P is the port number and S is
- * the delay in seconds. If the delay is not provided, the value
- * will be DEFAULT_RC_DELAY.
+ * the delay in seconds. If the argument is specified, but the delay is
+ * not provided, the value will be DEFAULT_RC_DELAY.
  */
 static int __devinitdata rc_delay[TILEGX_NUM_TRIO][TILEGX_TRIO_PCIES];

 /* Default number of seconds that the PCIe RC port probe can be delayed. */
 #define DEFAULT_RC_DELAY	10

-/* Max number of seconds that the PCIe RC port probe can be delayed. */
-#define MAX_RC_DELAY		20
-
+#if !defined(GX_FPGA)
 /* Array of the PCIe ports configuration info obtained from the BIB. */
 struct pcie_port_property pcie_ports[TILEGX_NUM_TRIO][TILEGX_TRIO_PCIES];
+#endif

 /* All drivers share the TRIO contexts defined here. */
 gxio_trio_context_t trio_contexts[TILEGX_NUM_TRIO];
@@ -97,6 +96,41 @@
 static struct cpumask intr_cpus_map;

 /*
+ * Convert a resource to a PCI device bus address or bus window.
+ */
+void __devinit
+pcibios_resource_to_bus(struct pci_dev *dev, struct pci_bus_region *region,
+			struct resource *res)
+{
+	struct pci_controller *controller =
+		(struct pci_controller *)dev->sysdata;
+	unsigned long offset = 0;
+
+	if (res->flags & IORESOURCE_MEM)
+		offset = controller->mem_offset;
+
+	region->start = res->start - offset;
+	region->end = res->end - offset;
+}
+EXPORT_SYMBOL(pcibios_resource_to_bus);
+
+void __devinit
+pcibios_bus_to_resource(struct pci_dev *dev, struct resource *res,
+			struct pci_bus_region *region)
+{
+	struct pci_controller *controller =
+		(struct pci_controller *)dev->sysdata;
+	unsigned long offset = 0;
+
+	if (res->flags & IORESOURCE_MEM)
+		offset = controller->mem_offset;
+
+	res->start = region->start + offset;
+	res->end = region->end + offset;
+}
+EXPORT_SYMBOL(pcibios_bus_to_resource);
+
+/*
  * We don't need to worry about the alignment of resources.
  */
 resource_size_t pcibios_align_resource(void *data, const struct resource *res,
@@ -274,6 +308,10 @@

 	cpumask_copy(&intr_cpus_map, cpu_online_mask);

+#ifdef CONFIG_DATAPLANE
+	/* Remove dataplane cpus. */
+	cpumask_andnot(&intr_cpus_map, &intr_cpus_map, &dataplane_map);
+#endif

 	for (i = 0; i < 4; i++) {
 		gxio_trio_context_t *context = controller->trio;
@@ -325,7 +363,7 @@
  *
  * Returns the number of controllers discovered.
  */
-int __init tile_pci_init(void)
+int __devinit tile_pci_init(void)
 {
 	int num_trio_shims = 0;
 	int ctl_index = 0;
@@ -359,6 +397,7 @@
 	 * We look at the Board Information Block first and then see if there
 	 * are any overriding configuration by the HW strapping pin.
 	 */
+#if !defined(GX_FPGA)
 	for (i = 0; i < TILEGX_NUM_TRIO; i++) {
 		gxio_trio_context_t *context = &trio_contexts[i];
 		int ret;
@@ -386,6 +425,13 @@
 			}
 		}
 	}
+#else
+	/*
+	 * For now, just assume that there is a single RC port on trio/0.
+	 */
+	num_rc_controllers = 1;
+	pcie_rc[0][2] = 1;
+#endif

 	/*
 	 * Return if no PCIe ports are configured to operate in RC mode.
@@ -424,13 +470,20 @@
 		controller->index = i;
 		controller->ops = &tile_cfg_ops;

+		controller->io_space.start = 0;
+		controller->io_space.end = IO_SPACE_LIMIT;
+		controller->io_space.flags = IORESOURCE_IO;
+		snprintf(controller->io_space_name,
+			 sizeof(controller->io_space_name),
+			 "PCI I/O domain %d", i);
+		controller->io_space.name = controller->io_space_name;
+
 		/*
 		 * The PCI memory resource is located above the PA space.
 		 * For every host bridge, the BAR window or the MMIO aperture
 		 * is in range [3GB, 4GB - 1] of a 4GB space beyond the
 		 * PA space.
 		 */
-
 		controller->mem_offset = TILE_PCI_MEM_START +
 			(i * TILE_PCI_BAR_WINDOW_TOP);
 		controller->mem_space.start = controller->mem_offset +
@@ -451,7 +504,7 @@
  * (pin - 1) converts from the PCI standard's [1:4] convention to
  * a normal [0:3] range.
  */
-static int tile_map_irq(const struct pci_dev *dev, u8 device, u8 pin)
+static int tile_map_irq(struct pci_dev *dev, u8 device, u8 pin)
 {
 	struct pci_controller *controller =
 		(struct pci_controller *)dev->sysdata;
@@ -463,11 +516,12 @@
 						controller)
 {
 	gxio_trio_context_t *trio_context = controller->trio;
-	struct pci_bus *root_bus = controller->root_bus;
 	TRIO_PCIE_RC_DEVICE_CONTROL_t dev_control;
 	TRIO_PCIE_RC_DEVICE_CAP_t rc_dev_cap;
+	unsigned int smallest_max_payload;
+	struct pci_dev *dev = NULL;
 	unsigned int reg_offset;
-	struct pci_bus *child;
+	u16 new_values;
 	int mac;
 	int err;

@@ -508,33 +562,59 @@
 	__gxio_mmio_write32(trio_context->mmio_base_mac + reg_offset,
 						rc_dev_cap.word);

-	/* Configure PCI Express MPS setting. */
-	list_for_each_entry(child, &root_bus->children, node) {
-		struct pci_dev *self = child->self;
-		if (!self)
+	smallest_max_payload = rc_dev_cap.mps_sup;
+
+	/* Scan for the smallest maximum payload size. */
+	while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
+		int pcie_caps_offset;
+		u32 devcap;
+		int max_payload;
+
+		/* Skip device that is not in this PCIe domain. */
+		if ((struct pci_controller *)dev->sysdata != controller)
 			continue;

-		pcie_bus_configure_settings(child, self->pcie_mpss);
+		pcie_caps_offset = pci_find_capability(dev, PCI_CAP_ID_EXP);
+		if (pcie_caps_offset == 0)
+			continue;
+
+		pci_read_config_dword(dev, pcie_caps_offset + PCI_EXP_DEVCAP,
+				      &devcap);
+		max_payload = devcap & PCI_EXP_DEVCAP_PAYLOAD;
+		if (max_payload < smallest_max_payload)
+			smallest_max_payload = max_payload;
+	}
+
+	/* Now, set the max_payload_size for all devices to that value. */
+	new_values = smallest_max_payload << 5;
+	while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
+		int pcie_caps_offset;
+		u16 devctl;
+
+		/* Skip device that is not in this PCIe domain. */
+		if ((struct pci_controller *)dev->sysdata != controller)
+			continue;
+
+		pcie_caps_offset = pci_find_capability(dev, PCI_CAP_ID_EXP);
+		if (pcie_caps_offset == 0)
+			continue;
+
+		pci_read_config_word(dev, pcie_caps_offset + PCI_EXP_DEVCTL,
+				     &devctl);
+		devctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
+		devctl |= new_values;
+		pci_write_config_word(dev, pcie_caps_offset + PCI_EXP_DEVCTL,
+				      devctl);
 	}

 	/*
 	 * Set the mac_config register in trio based on the MPS/MRS of the link.
 	 */
-	reg_offset =
-		(TRIO_PCIE_RC_DEVICE_CONTROL <<
-			TRIO_CFG_REGION_ADDR__REG_SHIFT) |
-		(TRIO_CFG_REGION_ADDR__INTFC_VAL_MAC_STANDARD <<
-			TRIO_CFG_REGION_ADDR__INTFC_SHIFT ) |
-		(mac << TRIO_CFG_REGION_ADDR__MAC_SEL_SHIFT);
-
-	dev_control.word = __gxio_mmio_read32(trio_context->mmio_base_mac +
-						reg_offset);
-
 	err = gxio_trio_set_mps_mrs(trio_context,
-				    dev_control.max_payload_size,
+				    smallest_max_payload,
 				    dev_control.max_read_req_sz,
 				    mac);
-        if (err < 0) {
+	if (err < 0) {
 		pr_err("PCI: PCIE_CONFIGURE_MAC_MPS_MRS failure, "
 			"MAC %d on TRIO %d\n",
 			mac, controller->trio_index);
@@ -571,14 +651,9 @@
 		if (!isdigit(*str))
 			return -EINVAL;
 		delay = simple_strtoul(str, (char **)&str, 10);
-		if (delay > MAX_RC_DELAY)
-			return -EINVAL;
 	}

 	rc_delay[trio_index][mac] = delay ? : DEFAULT_RC_DELAY;
-	pr_info("Delaying PCIe RC link training for %u sec"
-		" on MAC %lu on TRIO %lu\n", rc_delay[trio_index][mac],
-		mac, trio_index);
 	return 0;
 }
 early_param("pcie_rc_delay", setup_pcie_rc_delay);
@@ -586,18 +661,14 @@
 /*
  * PCI initialization entry point, called by subsys_initcall.
  */
-int __init pcibios_init(void)
+int __devinit pcibios_init(void)
 {
 	resource_size_t offset;
-	LIST_HEAD(resources);
 	int next_busno;
 	int i;

 	tile_pci_init();

-	if (num_rc_controllers == 0 && num_ep_controllers == 0)
-		return 0;
-
 	/*
 	 * We loop over all the TRIO shims and set up the MMIO mappings.
 	 */
@@ -623,6 +694,9 @@
 		}
 	}

+	if (num_rc_controllers == 0 && num_ep_controllers == 0)
+		return 0;
+
 	/*
 	 * Delay a bit in case devices aren't ready.  Some devices are
 	 * known to require at least 20ms here, but we use a more
@@ -684,15 +758,36 @@
 		}

 		/*
-		 * Delay the RC link training if needed.
+		 * Delay the bus probe if needed.
 		 */
-		if (rc_delay[trio_index][mac])
+		if (rc_delay[trio_index][mac]) {
+			pr_info("Delaying PCIe RC link training for %d sec"
+				" on MAC %d on TRIO %d\n",
+				rc_delay[trio_index][mac], mac,
+				trio_index);
 			msleep(rc_delay[trio_index][mac] * 1000);
+		}

-		ret = gxio_trio_force_rc_link_up(trio_context, mac);
-		if (ret < 0)
-			pr_err("PCI: PCIE_FORCE_LINK_UP failure, "
-				"MAC %d on TRIO %d\n", mac, trio_index);
+		/*
+		 * Check for PCIe link-up status to decide if we need
+		 * to force the link to come up.
+		 */
+		reg_offset =
+			(TRIO_PCIE_INTFC_PORT_STATUS <<
+				TRIO_CFG_REGION_ADDR__REG_SHIFT) |
+			(TRIO_CFG_REGION_ADDR__INTFC_VAL_MAC_INTERFACE <<
+				TRIO_CFG_REGION_ADDR__INTFC_SHIFT ) |
+			(mac << TRIO_CFG_REGION_ADDR__MAC_SEL_SHIFT);
+
+		port_status.word =
+			__gxio_mmio_read(trio_context->mmio_base_mac +
+					 reg_offset);
+		if (!port_status.dl_up) {
+			ret = gxio_trio_force_rc_link_up(trio_context, mac);
+			if (ret < 0)
+				pr_err("PCI: PCIE_FORCE_LINK_UP failure, "
+					"MAC %d on TRIO %d\n", mac, trio_index);
+		}

 		pr_info("PCI: Found PCI controller #%d on TRIO %d MAC %d\n", i,
 			trio_index, controller->mac);
@@ -704,22 +799,20 @@
 		msleep(1000);

 		/*
-		 * Check for PCIe link-up status.
+		 * Check for PCIe link-up status again.
 		 */
-
-		reg_offset =
-			(TRIO_PCIE_INTFC_PORT_STATUS <<
-				TRIO_CFG_REGION_ADDR__REG_SHIFT) |
-			(TRIO_CFG_REGION_ADDR__INTFC_VAL_MAC_INTERFACE <<
-				TRIO_CFG_REGION_ADDR__INTFC_SHIFT ) |
-			(mac << TRIO_CFG_REGION_ADDR__MAC_SEL_SHIFT);
-
 		port_status.word =
 			__gxio_mmio_read(trio_context->mmio_base_mac +
 					 reg_offset);
 		if (!port_status.dl_up) {
-			pr_err("PCI: link is down, MAC %d on TRIO %d\n",
-				mac, trio_index);
+			if (pcie_ports[trio_index][mac].removable) {
+				pr_info("PCI: link is down, MAC %d on TRIO %d",
+					mac, trio_index);
+				pr_info("This is expected if no PCIe card"
+					" is connected to this link");
+			} else
+				pr_err("PCI: link is down, MAC %d on TRIO %d",
+					mac, trio_index);
 			continue;
 		}

@@ -842,19 +935,22 @@
 		}

 		/*
-		 * The PCI memory resource is located above the PA space.
-		 * The memory range for the PCI root bus should not overlap
-		 * with the physical RAM
+		 * This comes from the generic Linux PCI driver.
+		 *
+		 * It reads the PCI tree for this bus into the Linux
+		 * data structures.
+		 *
+		 * This is inlined in linux/pci.h and calls into
+		 * pci_scan_bus_parented() in probe.c.
 		 */
-		pci_add_resource_offset(&resources, &controller->mem_space,
-					controller->mem_offset);
-
-		controller->first_busno = next_busno;
-		bus = pci_scan_root_bus(NULL, next_busno, controller->ops,
-					controller, &resources);
+		controller->first_busno= next_busno;
+		bus = pci_scan_bus(next_busno, controller->ops, controller);
 		controller->root_bus = bus;
-		next_busno = bus->busn_res.end + 1;
-
+#if 0
+		next_busno = bus->subordinate + 1;
+#else
+		next_busno = 0;
+#endif
 	}

 	/* Do machine dependent PCI interrupt routing */
@@ -951,6 +1047,37 @@
 		}

 		/*
+		 * Alloc a PIO region for PCI I/O space access for each RC port.
+		 */
+		ret = gxio_trio_alloc_pio_regions(trio_context, 1, 0, 0);
+		if (ret < 0) {
+			pr_err("PCI: I/O PIO alloc failure on TRIO %d mac %d, "
+				"give up\n", controller->trio_index,
+				controller->mac);
+
+			continue;
+		}
+
+		controller->pio_io_index = ret;
+
+		/*
+		 * For PIO IO, the bus_address_hi parameter is hard-coded 0
+		 * because PCI I/O address space is 32-bit.
+		 */
+		ret = gxio_trio_init_pio_region_aux(trio_context,
+						    controller->pio_io_index,
+						    controller->mac,
+						    0,
+						    HV_TRIO_PIO_FLAG_IO_SPACE);
+		if (ret < 0) {
+			pr_err("PCI: I/O PIO init failure on TRIO %d mac %d, "
+				"give up\n", controller->trio_index,
+				controller->mac);
+
+			continue;
+		}
+
+		/*
 		 * Configure a Mem-Map region for each memory controller so
 		 * that Linux can map all of its PA space to the PCI bus.
 		 * Use the IOMMU to handle hash-for-home memory.
@@ -1015,9 +1142,22 @@
 }
 subsys_initcall(pcibios_init);

-/* Note: to be deleted after Linux 3.6 merge. */
+/*
+ * PCI scan code calls the arch specific pcibios_fixup_bus() each time it scans
+ * a new bridge. Called after each bus is probed, but before its children are
+ * examined.
+ */
 void __devinit pcibios_fixup_bus(struct pci_bus *bus)
 {
+	struct pci_dev *dev = bus->self;
+
+	if (!dev) {
+		struct pci_controller *controller = bus->sysdata;
+
+		/* This is the root bus. */
+		bus->resource[0] = &controller->io_space;
+		bus->resource[1] = &controller->mem_space;
+	}
 }

 /*
@@ -1043,8 +1183,7 @@

 /*
  * Enable memory address decoding, as appropriate, for the
- * device described by the 'dev' struct. The I/O decoding
- * is disabled, though the TILE-Gx supports I/O addressing.
+ * device described by the 'dev' struct.
  *
  * This is called from the generic PCI layer, and can be called
  * for bridges or endpoints.
@@ -1126,10 +1265,95 @@
 	 * We need to keep the PCI bus address's in-page offset in the VA.
 	 */
 	return iorpc_ioremap(trio_fd, offset, size) +
-		(phys_addr & (PAGE_SIZE - 1));
+		(start & (PAGE_SIZE - 1));
 }
 EXPORT_SYMBOL(ioremap);

+/* Map a PCI I/O address into VA space. */
+void __iomem *ioport_map(unsigned long port, unsigned int size)
+{
+	struct pci_controller *controller = NULL;
+	resource_size_t bar_start;
+	resource_size_t bar_end;
+	resource_size_t offset;
+	resource_size_t start;
+	resource_size_t end;
+	int trio_fd;
+	int i;
+
+	start = port;
+	end = port + size - 1;
+
+	/*
+	 * In the following, each PCI controller's mem_resources[0]
+	 * represents its PCI I/O resource. By searching port in each
+	 * controller's mem_resources[0], we can determine the controller
+	 * that should accept the PCI I/O access.
+	 */
+
+	for (i = 0; i < num_rc_controllers; i++) {
+		/*
+		 * Skip controllers that are not properly initialized or
+		 * have down links.
+		 */
+		if (pci_controllers[i].root_bus == NULL)
+			continue;
+
+		bar_start = pci_controllers[i].mem_resources[0].start;
+		bar_end = pci_controllers[i].mem_resources[0].end;
+
+		if ((start >= bar_start) && (end <= bar_end)) {
+
+			controller = &pci_controllers[i];
+
+			goto got_it;
+		}
+	}
+
+	if (controller == NULL)
+		return NULL;
+
+got_it:
+	trio_fd = controller->trio->fd;
+
+	offset = HV_TRIO_PIO_OFFSET(controller->pio_io_index) + port;
+
+	/*
+	 * We need to keep the PCI bus address's in-page offset in the VA.
+	 */
+	return iorpc_ioremap(trio_fd, offset, size) + (port & (PAGE_SIZE - 1));
+}
+EXPORT_SYMBOL(ioport_map);
+
+void ioport_unmap(void __iomem *addr)
+{
+	iounmap(addr);
+}
+EXPORT_SYMBOL(ioport_unmap);
+
+/*
+ * Create a virtual mapping cookie for a PCI BAR (memory or IO).
+ */
+void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max)
+{
+	resource_size_t start = pci_resource_start(dev, bar);
+	resource_size_t len = pci_resource_len(dev, bar);
+	unsigned long flags = pci_resource_flags(dev, bar);
+
+	if (!len)
+		return NULL;
+	if (max && len > max)
+		len = max;
+	if (flags & IORESOURCE_IO)
+		return ioport_map(start, len);
+	if (flags & IORESOURCE_MEM)
+		return ioremap(start, len);
+
+	pr_err("PCI: Trying to map invalid resource %#lx\n", flags);
+	return NULL;
+}
+EXPORT_SYMBOL(pci_iomap);
+
 void pci_iounmap(struct pci_dev *dev, void __iomem *addr)
 {
 	iounmap(addr);
@@ -1478,32 +1702,55 @@
 	trio_context = controller->trio;

 	/*
-	 * Allocate the Mem-Map that will accept the MSI write and
-	 * trigger the TILE-side interrupts.
-	 */
-	mem_map = gxio_trio_alloc_memory_maps(trio_context, 1, 0, 0);
-	if (mem_map < 0) {
-		dev_printk(KERN_INFO, &pdev->dev,
-			"%s Mem-Map alloc failure. "
-			"Failed to initialize MSI interrupts. "
-			"Falling back to legacy interrupts.\n",
-			desc->msi_attrib.is_msix ? "MSI-X" : "MSI");
+	 * Allocate a scatter-queue that will accept the MSI write and
+	 * trigger the TILE-side interrupts. We use the scatter-queue regions
+	 * before the mem map regions, because the latter are needed by more
+	 * applications.
+	 */
+	mem_map = gxio_trio_alloc_scatter_queues(trio_context, 1, 0, 0);
+	if (mem_map >= 0) {
+		TRIO_MAP_SQ_DOORBELL_FMT_t doorbell_template = {{
+			.pop = 0,
+			.doorbell = 1,
+		}};
+
+		mem_map += TRIO_NUM_MAP_MEM_REGIONS;
+		mem_map_base = MEM_MAP_INTR_REGIONS_BASE +
+			mem_map * MEM_MAP_INTR_REGION_SIZE;
+		mem_map_limit = mem_map_base + MEM_MAP_INTR_REGION_SIZE - 1;
+
+		msi_addr = mem_map_base + MEM_MAP_INTR_REGION_SIZE - 8;
+		msg.data = (unsigned int)doorbell_template.word;
+	} else {
+		/* SQ regions are out, allocate from map mem regions. */
+		mem_map = gxio_trio_alloc_memory_maps(trio_context, 1, 0, 0);
+		if (mem_map < 0) {
+			dev_printk(KERN_INFO, &pdev->dev,
+				"%s Mem-Map alloc failure. "
+				"Failed to initialize MSI interrupts. "
+				"Falling back to legacy interrupts.\n",
+				desc->msi_attrib.is_msix ? "MSI-X" : "MSI");
+			ret = -ENOMEM;
+			goto msi_mem_map_alloc_failure;
+		}

-		ret = -ENOMEM;
-		goto msi_mem_map_alloc_failure;
+		mem_map_base = MEM_MAP_INTR_REGIONS_BASE +
+			mem_map * MEM_MAP_INTR_REGION_SIZE;
+		mem_map_limit = mem_map_base + MEM_MAP_INTR_REGION_SIZE - 1;
+
+		msi_addr = mem_map_base + TRIO_MAP_MEM_REG_INT3 -
+			TRIO_MAP_MEM_REG_INT0;
+
+		msg.data = mem_map;
 	}

 	/* We try to distribute different IRQs to different tiles. */
 	cpu = tile_irq_cpu(irq);

 	/*
-	 * Now call up to the HV to configure the Mem-Map interrupt and
+	 * Now call up to the HV to configure the MSI interrupt and
 	 * set up the IPI binding.
 	 */
-	mem_map_base = MEM_MAP_INTR_REGIONS_BASE +
-		mem_map * MEM_MAP_INTR_REGION_SIZE;
-	mem_map_limit = mem_map_base + MEM_MAP_INTR_REGION_SIZE - 1;
-
 	ret = gxio_trio_config_msi_intr(trio_context, cpu_x(cpu), cpu_y(cpu),
 					KERNEL_PL, irq, controller->mac,
 					mem_map, mem_map_base, mem_map_limit,
@@ -1516,13 +1763,9 @@

 	irq_set_msi_desc(irq, desc);

-	msi_addr = mem_map_base + TRIO_MAP_MEM_REG_INT3 - TRIO_MAP_MEM_REG_INT0;
-
 	msg.address_hi = msi_addr >> 32;
 	msg.address_lo = msi_addr & 0xffffffff;

-	msg.data = mem_map;
-
 	write_msi_msg(irq, &msg);
 	irq_set_chip_and_handler(irq, &tilegx_msi_chip, handle_level_irq);
 	irq_set_handler_data(irq, controller);


What we got after my fix:

pci 0000:00:00.0: BAR 8: assigned [mem 0x100c0000000-0x100c00fffff]
pci 0000:00:00.0: BAR 9: assigned [mem 0x100c0100000-0x100c01fffff pref]
pci 0000:00:00.0: BAR 7: assigned [io  0x1000-0x1fff]
pci 0000:01:00.0: BAR 6: assigned [mem 0x100c0100000-0x100c013ffff pref]
pci 0000:01:00.0: BAR 6: set to [mem 0x100c0100000-0x100c013ffff pref] (PCI address [0xc0100000-0xc013ffff])
pci 0000:01:00.0: BAR 4: assigned [mem 0x100c0000000-0x100c000ffff 64bit]
pci 0000:01:00.0: BAR 4: set to [mem 0x100c0000000-0x100c000ffff 64bit] (PCI address [0xc0000000-0xc000ffff])
pci 0000:01:00.0: BAR 2: assigned [io  0x1000-0x107f]
pci 0000:01:00.0: BAR 2: set to [io  0x1000-0x107f] (PCI address [0x1000-0x107f])
pci 0000:00:00.0: PCI bridge to [bus 01-01]
pci 0000:00:00.0:   bridge window [io  0x1000-0x1fff]
pci 0000:00:00.0:   bridge window [mem 0x100c0000000-0x100c00fffff]
pci 0000:00:00.0:   bridge window [mem 0x100c0100000-0x100c01fffff pref]
pci 0001:00:00.0: BAR 8: assigned [mem 0x101c0000000-0x101c00fffff]
pci 0001:00:00.0: BAR 9: assigned [mem 0x101c0100000-0x101c01fffff pref]
pci 0001:00:00.0: BAR 7: assigned [io  0x80001000-0x80001fff]
pci 0001:01:00.0: BAR 6: assigned [mem 0x101c0100000-0x101c013ffff pref]
pci 0001:01:00.0: BAR 6: set to [mem 0x101c0100000-0x101c013ffff pref] (PCI address [0xc0100000-0xc013ffff])
pci 0001:01:00.0: BAR 4: assigned [mem 0x101c0000000-0x101c000ffff 64bit]
pci 0001:01:00.0: BAR 4: set to [mem 0x101c0000000-0x101c000ffff 64bit] (PCI address [0xc0000000-0xc000ffff])
pci 0001:01:00.0: BAR 2: assigned [io  0x80001000-0x8000107f]
pci 0001:01:00.0: BAR 2: set to [io  0x80001000-0x8000107f] (PCI address [0x80001000-0x8000107f])
pci 0001:00:00.0: PCI bridge to [bus 01-01]
pci 0001:00:00.0:   bridge window [io  0x80001000-0x80001fff]
pci 0001:00:00.0:   bridge window [mem 0x101c0000000-0x101c00fffff]
pci 0001:00:00.0:   bridge window [mem 0x101c0100000-0x101c01fffff pref]
pci 0000:00:00.0: enabling device (0006 -> 0007)
pci 0001:00:00.0: enabling device (0006 -> 0007)
pci_bus 0000:00: resource 0 [io  0x1000-0x800007ff]
pci_bus 0000:00: resource 1 [mem 0x100c0000000-0x100ffffffff]
pci_bus 0000:01: resource 0 [io  0x1000-0x1fff]
pci_bus 0000:01: resource 1 [mem 0x100c0000000-0x100c00fffff]
pci_bus 0000:01: resource 2 [mem 0x100c0100000-0x100c01fffff pref]
pci_bus 0001:00: resource 0 [io  0x80000800-0xffffffff]
pci_bus 0001:00: resource 1 [mem 0x101c0000000-0x101ffffffff]
pci_bus 0001:01: resource 0 [io  0x80001000-0x80001fff]
pci_bus 0001:01: resource 1 [mem 0x101c0000000-0x101c00fffff]
pci_bus 0001:01: resource 2 [mem 0x101c0100000-0x101c01fffff pref]
......
mvsas 0000:01:00.0: mvsas: driver version 0.8.2
mvsas 0000:01:00.0: enabling device (0000 -> 0003)
mvsas 0000:01:00.0: enabling bus mastering
mvsas 0000:01:00.0: mvsas: PCI-E x4, Bandwidth Usage: 2.5 Gbps
scsi0 : mvsas
......
mvsas 0001:01:00.0: mvsas: driver version 0.8.2
mvsas 0001:01:00.0: enabling device (0000 -> 0003)
mvsas 0001:01:00.0: enabling bus mastering
mvsas 0001:01:00.0: mvsas: PCI-E x4, Bandwidth Usage: 2.5 Gbps
scsi1 : mvsas


It works now, but I really need someone to confirm whether my
modification is sufficient, and whether there are other potential
problems.



Best regards.

-- 
Cyberman Wu