Message-ID: <49744142.2060806@jp.fujitsu.com>
Date:	Mon, 19 Jan 2009 18:00:50 +0900
From:	Kenji Kaneshige <kaneshige.kenji@...fujitsu.com>
To:	Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>
CC:	"Rafael J. Wysocki" <rjw@...k.pl>,
	Jesse Barnes <jbarnes@...tuousgeek.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Linux PCI <linux-pci@...r.kernel.org>
Subject: Re: [PATCH] PCI PCIe portdrv: Fix allocation of interrupts (rev. 5)

Hidetoshi Seto wrote:
> Rafael J. Wysocki wrote:
>> On Saturday 17 January 2009, Rafael J. Wysocki wrote:
> (snip)
>>> If MSI-X is supported, it allocates as many vectors as there are entries
>>> in the port's MSI-X table, but no more than 32, and figures out which of them
>>> will be used for the port services.
>> The patch didn't check which services are available during the MSI-X set up
>> which was wrong.
>>
>> Also, in the meantime, I thought it might be a good idea to free the interrupt
>> routing table entries that aren't going to be used after all.
>>
>> The patch below adds this to the previous version and checks for the
>> availability of port services in the MSI-X setup resume.  I hope it will
>> be acceptable to everyone.
>>
>> Thanks,
>> Rafael
>>
>> ---
>> Subject: PCI PCIe portdrv: Fix allocation of interrupts (rev. 5)
>> From: Rafael J. Wysocki <rjw@...k.pl>
>>
>> If MSI-X interrupt mode is used by the PCI Express port driver, too
>> many vectors are allocated and it is not ensured that the right
>> vectors will be used for the right services.  Namely, the PCI Express
>> specification states that both PCI Express native PME and PCI Express
>> hotplug will always use the same MSI or MSI-X message for signalling
>> interrupts, which implies that the same vector will be used by both
>> of them.  Also, the VC service does not use interrupts at all.
>> Moreover, it is not clear which of the vectors allocated by
>> pci_enable_msix() in the current code will be used for PME and
>> hotplug and which of them will be used for AER if all of these
>> services are configured.
>>
>> For these reasons, rework the allocation of interrupts for PCI
>> Express ports so that if MSI-X is enabled, the right vectors will be
>> used for the right purposes.
>>
>> Signed-off-by: Rafael J. Wysocki <rjw@...k.pl>
>> ---
>>  drivers/pci/msi.c               |   24 +++-
>>  drivers/pci/pcie/portdrv.h      |    6 +
>>  drivers/pci/pcie/portdrv_core.c |  195 ++++++++++++++++++++++++++++++++--------
>>  include/linux/pci.h             |    5 +
>>  include/linux/pcieport_if.h     |   12 +-
>>  5 files changed, 194 insertions(+), 48 deletions(-)
>>
>> Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
>> ===================================================================
>> --- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
>> +++ linux-2.6/drivers/pci/pcie/portdrv_core.c
>> @@ -31,6 +31,141 @@ static void release_pcie_device(struct d
>>  }
>>  
>>  /**
>> + * pcie_port_msix_add_entry - add entry to given array of MSI-X entries
>> + * @entries: Array of MSI-X entries
>> + * @new_entry: Index of the entry to add to the array
>> + * @nr_entries: Number of entries already in the array
>> + *
>> + * Return value: Position of the added entry in the array
>> + */
>> +static int pcie_port_msix_add_entry(
>> +	struct msix_entry *entries, int new_entry, int nr_entries)
>> +{
>> +	int j;
>> +
>> +	for (j = 0; j < nr_entries; j++)
>> +		if (entries[j].entry == new_entry)
>> +			return j;
>> +
>> +	entries[j].entry = new_entry;
>> +	return j;
>> +}
>> +
>> +/**
>> + * pcie_port_enable_msix - try to set up MSI-X as interrupt mode for given port
>> + * @dev: PCI Express port to handle
>> + * @vectors: Array of interrupt vectors to populate
>> + * @mask: Bitmask of port capabilities returned by get_port_device_capability()
>> + *
>> + * Return value: 0 on success, error code on failure
>> + */
>> +static int pcie_port_enable_msix(struct pci_dev *dev, int *vectors, int mask)
>> +{
>> +	struct msix_entry *msix_entries;
>> +	int idx[PCIE_PORT_DEVICE_MAXSERVICES];
>> +	int nr_entries, status, pos, i, nvec;
>> +	u16 reg16;
>> +	u32 reg32;
>> +
>> +	nr_entries = pci_msix_table_size(dev);
>> +	if (!nr_entries)
>> +		return -EINVAL;
>> +	if (nr_entries > PCIE_PORT_MAX_MSIX_ENTRIES)
>> +		nr_entries = PCIE_PORT_MAX_MSIX_ENTRIES;
>> +
>> +	msix_entries = kzalloc(sizeof(*msix_entries) * nr_entries, GFP_KERNEL);
>> +	if (!msix_entries)
>> +		return -ENOMEM;
>> +
>> +	/*
>> +	 * Allocate as many entries as the device wants temporarily, so that we
>> +	 * can check which of them will be useful.
>> +	 */
>> +	for (i = 0; i < nr_entries; i++)
>> +		msix_entries[i].entry = i;
> 
> 	/*
> 	 * So, if the number of msix_entries already equals the number of
> 	 * entries this port actually uses, we can happily go through without
> 	 * using the trick.
> 	 */
>> +
>> +	status = pci_enable_msix(dev, msix_entries, nr_entries);
>> +	if (status)
>> +		goto Exit;
>> +
>> +	for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
>> +		idx[i] = -1;
>> +	status = -EIO;
>> +	nvec = 0;
>> +
>> +	if (mask & (PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP)) {
>> +		int entry;
>> +
>> +		/*
>> +		 * The code below follows the PCI Express Base Specification 2.0
>> +		 * stating in Section 6.1.6 that "PME and Hot-Plug Event
>> +		 * interrupts (when both are implemented) always share the same
>> +		 * MSI or MSI-X vector, as indicated by the Interrupt Message
>> +		 * Number field in the PCI Express Capabilities register", where
>> +		 * according to Section 7.8.2 of the specification "For MSI-X,
>> +		 * the value in this field indicates which MSI-X Table entry is
>> +		 * used to generate the interrupt message."
>> +		 */
>> +		pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
>> +		pci_read_config_word(dev, pos + PCIE_CAPABILITIES_REG, &reg16);
>> +		entry = (reg16 >> 9) & PCIE_PORT_MSI_VECTOR_MASK;
>> +		if (entry >= nr_entries)
>> +			goto Error;
>> +
>> +		i = pcie_port_msix_add_entry(msix_entries, entry, nvec);
>> +		if (i == nvec)
>> +			nvec++;
>> +
>> +		idx[PCIE_PORT_SERVICE_PME_SHIFT] = i;
>> +		idx[PCIE_PORT_SERVICE_HP_SHIFT] = i;
>> +	}
>> +
>> +	if (mask & PCIE_PORT_SERVICE_AER) {
>> +		int entry;
>> +
>> +		/*
>> +		 * The code below follows Section 7.10.10 of the PCI Express
>> +		 * Base Specification 2.0 stating that bits 31-27 of the Root
>> +		 * Error Status Register contain a value indicating which of the
>> +		 * MSI/MSI-X vectors assigned to the port is going to be used
>> +		 * for AER, where "For MSI-X, the value in this register
>> +		 * indicates which MSI-X Table entry is used to generate the
>> +		 * interrupt message."
>> +		 */
>> +		pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR);
>> +		pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &reg32);
>> +		entry = reg32 >> 27;
>> +		if (entry >= nr_entries)
>> +			goto Error;
>> +
>> +		i = pcie_port_msix_add_entry(msix_entries, entry, nvec);
>> +		if (i == nvec)
>> +			nvec++;
>> +
>> +		idx[PCIE_PORT_SERVICE_AER_SHIFT] = i;
>> +	}
>> +
> 
> /* Are there any unused entries? */
> if (nr_allocated > nvec) {
> 	/* this port has extra entries not used by the services we know... */
> 

Sounds good. That is:

If the number of entries we require equals nr_entries (from the
Table Size field in the Message Control register), we don't need to
drop the first MSI-X setup.
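
A rough user-space sketch of that check follows. It is only an
illustration under stated assumptions, not the patch itself:
struct msix_entry_sketch, msix_table_size_from_control(), add_entry(),
collect_entries() and needs_second_setup() are hypothetical names made
up here, and the code only counts distinct table entries in a scratch
array. It does not touch hardware or look up vectors.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct msix_entry. */
struct msix_entry_sketch {
	unsigned int vector;	/* filled in by the kernel in the real code */
	unsigned short entry;	/* index into the MSI-X table */
};

/* MSI-X Message Control bits 10:0 hold (table size - 1). */
static int msix_table_size_from_control(unsigned short control)
{
	return (control & 0x07ff) + 1;
}

/*
 * Same idea as pcie_port_msix_add_entry() in the patch: return the
 * position of new_entry, appending it at index nr if it is not there yet.
 */
static int add_entry(struct msix_entry_sketch *entries, int new_entry, int nr)
{
	int j;

	for (j = 0; j < nr; j++)
		if (entries[j].entry == new_entry)
			return j;

	entries[j].entry = new_entry;
	return j;
}

/*
 * Count the distinct table entries requested by the services.
 * The scratch array "entries" must have room for nr_wanted elements.
 */
static int collect_entries(struct msix_entry_sketch *entries,
			   const int *wanted, int nr_wanted)
{
	int i, nvec = 0;

	for (i = 0; i < nr_wanted; i++)
		if (add_entry(entries, wanted[i], nvec) == nvec)
			nvec++;

	return nvec;
}

/* Non-zero if the temporary setup allocated more entries than needed. */
static int needs_second_setup(int nr_entries, int nvec)
{
	return nvec < nr_entries;
}
```

With a two-entry table where PME/hotplug and AER use different
entries, needs_second_setup(2, 2) is zero and the first setup could be
kept; with a 32-entry table it is non-zero and the second
pci_enable_msix() call is still needed.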

Thanks,
Kenji Kaneshige



>> +	/* Drop the temporary MSI-X setup */
>> +	pci_disable_msix(dev);
>> +
>> +	/* Now allocate the MSI-X vectors for real */
>> +	status = pci_enable_msix(dev, msix_entries, nvec);
>> +	if (status)
>> +		goto Error;
> 
> 	/*
> 	 * There is broken hardware in the world, so even though the spec says
> 	 * the numbers are constant, it would be better to re-check the
> 	 * registers after the 2nd pci_enable_msix().  Or we could just skip
> 	 * this.  (However, wasn't this what you were concerned about, Rafael?)
> 	 */
> 	if (func_foo_do_paranoia_check(dev, msix_entries, nvec))
> 		goto Error;	
> }
> 
>> +
>> +	for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
>> +		vectors[i] = idx[i] >= 0 ? msix_entries[idx[i]].vector : -1;
>> +
>> + Exit:
>> +	kfree(msix_entries);
>> +	return status;
>> +
>> + Error:
>> +	pci_disable_msix(dev);
>> +	goto Exit;
>> +}
>> +
>> +/**
>>   * assign_interrupt_mode - choose interrupt mode for PCI Express port services
>>   *                         (INTx, MSI-X, MSI) and set up vectors
>>   * @dev: PCI Express port to handle
>> @@ -42,49 +177,31 @@ static void release_pcie_device(struct d
>>  static int assign_interrupt_mode(struct pci_dev *dev, int *vectors, int mask)
>>  {
>>  	struct pcie_port_data *port_data = pci_get_drvdata(dev);
>> -	int i, pos, nvec, status = -EINVAL;
>> -	int interrupt_mode = PCIE_PORT_NO_IRQ;
>> -
>> -	/* Set INTx as default */
>> -	for (i = 0, nvec = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
>> -		if (mask & (1 << i)) 
>> -			nvec++;
>> -		vectors[i] = dev->irq;
>> -	}
>> -	if (dev->pin)
>> -		interrupt_mode = PCIE_PORT_INTx_MODE;
>> +	int irq, interrupt_mode = PCIE_PORT_NO_IRQ;
>> +	int i;
>>  
>>  	/* Check MSI quirk */
>>  	if (port_data->port_type == PCIE_RC_PORT && pcie_mch_quirk)
>> -		return interrupt_mode;
>> +		goto Fallback;
>> +
>> +	/* Try to use MSI-X if supported */
>> +	if (!pcie_port_enable_msix(dev, vectors, mask))
>> +		return PCIE_PORT_MSIX_MODE;
>> +
>> +	/* We're not going to use MSI-X, so try MSI and fall back to INTx */
>> +	if (!pci_enable_msi(dev))
>> +		interrupt_mode = PCIE_PORT_MSI_MODE;
>> +
>> + Fallback:
>> +	if (interrupt_mode == PCIE_PORT_NO_IRQ && dev->pin)
>> +		interrupt_mode = PCIE_PORT_INTx_MODE;
>> +
>> +	irq = interrupt_mode != PCIE_PORT_NO_IRQ ? dev->irq : -1;
>> +	for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
>> +		vectors[i] = irq;
>> +
>> +	vectors[PCIE_PORT_SERVICE_VC_SHIFT] = -1;
>>  
>> -	/* Select MSI-X over MSI if supported */		
>> -	pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
>> -	if (pos) {
>> -		struct msix_entry msix_entries[PCIE_PORT_DEVICE_MAXSERVICES] = 
>> -			{{0, 0}, {0, 1}, {0, 2}, {0, 3}};
>> -		status = pci_enable_msix(dev, msix_entries, nvec);
>> -		if (!status) {
>> -			int j = 0;
>> -
>> -			interrupt_mode = PCIE_PORT_MSIX_MODE;
>> -			for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
>> -				if (mask & (1 << i)) 
>> -					vectors[i] = msix_entries[j++].vector;
>> -			}
>> -		}
>> -	} 
>> -	if (status) {
>> -		pos = pci_find_capability(dev, PCI_CAP_ID_MSI);
>> -		if (pos) {
>> -			status = pci_enable_msi(dev);
>> -			if (!status) {
>> -				interrupt_mode = PCIE_PORT_MSI_MODE;
>> -				for (i = 0;i < PCIE_PORT_DEVICE_MAXSERVICES;i++)
>> -					vectors[i] = dev->irq;
>> -			}
>> -		}
>> -	} 
>>  	return interrupt_mode;
>>  }
>>  
>> Index: linux-2.6/include/linux/pcieport_if.h
>> ===================================================================
>> --- linux-2.6.orig/include/linux/pcieport_if.h
>> +++ linux-2.6/include/linux/pcieport_if.h
>> @@ -16,10 +16,14 @@
>>  #define PCIE_ANY_PORT			7
>>  
>>  /* Service Type */
>> -#define PCIE_PORT_SERVICE_PME		1	/* Power Management Event */
>> -#define PCIE_PORT_SERVICE_AER		2	/* Advanced Error Reporting */
>> -#define PCIE_PORT_SERVICE_HP		4	/* Native Hotplug */
>> -#define PCIE_PORT_SERVICE_VC		8	/* Virtual Channel */
>> +#define PCIE_PORT_SERVICE_PME_SHIFT	0	/* Power Management Event */
>> +#define PCIE_PORT_SERVICE_PME		(1 << PCIE_PORT_SERVICE_PME_SHIFT)
>> +#define PCIE_PORT_SERVICE_AER_SHIFT	1	/* Advanced Error Reporting */
>> +#define PCIE_PORT_SERVICE_AER		(1 << PCIE_PORT_SERVICE_AER_SHIFT)
>> +#define PCIE_PORT_SERVICE_HP_SHIFT	2	/* Native Hotplug */
>> +#define PCIE_PORT_SERVICE_HP		(1 << PCIE_PORT_SERVICE_HP_SHIFT)
>> +#define PCIE_PORT_SERVICE_VC_SHIFT	3	/* Virtual Channel */
>> +#define PCIE_PORT_SERVICE_VC		(1 << PCIE_PORT_SERVICE_VC_SHIFT)
>>  
>>  /* Root/Upstream/Downstream Port's Interrupt Mode */
>>  #define PCIE_PORT_NO_IRQ		(-1)
>> Index: linux-2.6/drivers/pci/pcie/portdrv.h
>> ===================================================================
>> --- linux-2.6.orig/drivers/pci/pcie/portdrv.h
>> +++ linux-2.6/drivers/pci/pcie/portdrv.h
>> @@ -25,6 +25,12 @@
>>  #define PCIE_CAPABILITIES_REG		0x2
>>  #define PCIE_SLOT_CAPABILITIES_REG	0x14
>>  #define PCIE_PORT_DEVICE_MAXSERVICES	4
>> +#define PCIE_PORT_MSI_VECTOR_MASK	0x1f
>> +/*
>> + * According to the PCI Express Base Specification 2.0, the indices of the MSI-X
>> + * table entries used by port services must not exceed 31
>> + */
>> +#define PCIE_PORT_MAX_MSIX_ENTRIES	32
>>  
>>  #define get_descriptor_id(type, service) (((type - 4) << 4) | service)
>>  
>> Index: linux-2.6/drivers/pci/msi.c
>> ===================================================================
>> --- linux-2.6.orig/drivers/pci/msi.c
>> +++ linux-2.6/drivers/pci/msi.c
>> @@ -670,6 +670,23 @@ static int msi_free_irqs(struct pci_dev*
>>  }
>>  
>>  /**
>> + * pci_msix_table_size - return the number of device's MSI-X table entries
>> + * @dev: pointer to the pci_dev data structure of MSI-X device function
>> + */
>> +int pci_msix_table_size(struct pci_dev *dev)
>> +{
>> +	int pos;
>> +	u16 control;
>> +
>> +	pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
>> +	if (!pos)
>> +		return 0;
>> +
>> +	pci_read_config_word(dev, msi_control_reg(pos), &control);
>> +	return multi_msix_capable(control);
>> +}
>> +
>> +/**
> 
> I think this pci_msix_table_size() is useful on its own.
> It would be nice to have it in a separate patch.
> 
> i.e.:
>  [PATCH] PCI/MSI: introduce pci_msix_table_size()
>  [PATCH] PCI PCIe portdrv: Fix allocation of interrupts (rev. 6)
> 
> 
> Thanks,
> H.Seto
> 
>>   * pci_enable_msix - configure device's MSI-X capability structure
>>   * @dev: pointer to the pci_dev data structure of MSI-X device function
>>   * @entries: pointer to an array of MSI-X entries
>> @@ -686,9 +703,8 @@ static int msi_free_irqs(struct pci_dev*
>>   **/
>>  int pci_enable_msix(struct pci_dev* dev, struct msix_entry *entries, int nvec)
>>  {
>> -	int status, pos, nr_entries;
>> +	int status, nr_entries;
>>  	int i, j;
>> -	u16 control;
>>  
>>  	if (!entries)
>>   		return -EINVAL;
>> @@ -697,9 +713,7 @@ int pci_enable_msix(struct pci_dev* dev,
>>  	if (status)
>>  		return status;
>>  
>> -	pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
>> -	pci_read_config_word(dev, msi_control_reg(pos), &control);
>> -	nr_entries = multi_msix_capable(control);
>> +	nr_entries = pci_msix_table_size(dev);
>>  	if (nvec > nr_entries)
>>  		return -EINVAL;
>>  
>> Index: linux-2.6/include/linux/pci.h
>> ===================================================================
>> --- linux-2.6.orig/include/linux/pci.h
>> +++ linux-2.6/include/linux/pci.h
>> @@ -799,6 +799,10 @@ static inline void pci_msi_shutdown(stru
>>  static inline void pci_disable_msi(struct pci_dev *dev)
>>  { }
>>  
>> +static inline int pci_msix_table_size(struct pci_dev *dev)
>> +{
>> +	return 0;
>> +}
>>  static inline int pci_enable_msix(struct pci_dev *dev,
>>  				  struct msix_entry *entries, int nvec)
>>  {
>> @@ -823,6 +827,7 @@ static inline int pci_msi_enabled(void)
>>  extern int pci_enable_msi(struct pci_dev *dev);
>>  extern void pci_msi_shutdown(struct pci_dev *dev);
>>  extern void pci_disable_msi(struct pci_dev *dev);
>> +extern int pci_msix_table_size(struct pci_dev *dev);
>>  extern int pci_enable_msix(struct pci_dev *dev,
>>  	struct msix_entry *entries, int nvec);
>>  extern void pci_msix_shutdown(struct pci_dev *dev);
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@...r.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>