lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110421202631.GE15612@kroah.com>
Date:	Thu, 21 Apr 2011 13:26:32 -0700
From:	Greg KH <greg@...ah.com>
To:	Neil Horman <nhorman@...driver.com>
Cc:	linux-kernel@...r.kernel.org,
	Jesse Barnes <jbarnes@...tuousgeek.org>,
	linux-pci@...r.kernel.org
Subject: Re: [PATCH] pci: Export pci device msi table via sysfs

On Thu, Apr 21, 2011 at 02:57:47PM -0400, Neil Horman wrote:
> I've been working on some improvements to how we balance irqs for high volume
> interrupt sources.  The consensus so far has been that what would be really
> helpful is a irqbalance mechanism that operates in a one shot mode in response
> to the addition of high volume interrupt sources (i.e. network devices mainly).
> In attempting to implement this, I've found that it would be really useful to
> have 2 bits of information:
> 
> 1) A clear correlation of which interrupts belong to which devices.  Parsing
> /proc/interrupts is an exercize in guessing what naming pattern a given driver
> follows, and requires some amount of stateful information to be kept in user
> space, lest every device addition require a rebalancing of every interrupt in
> the system.
> 
> 2) A indicator as to which kind of interrupts a given device is using.  The irq
> attribute for a pci device is always accurate in that it simply reads whats in
> the appropriate pci config space register, but devices using msi interrupts have
> no use for that register, and never request that interrupt number.
> 
> This patch adds the requisite information.  It creates two per-pci-device irq
> attribute files:
> 
> a) irq_mode - identifies which kind of irqs the device in question is using,
> intx/msi/msix
> 
> b) msi_table - populated only if msi(x) is enabled, it lists the irqs allocated
> to the pci device
> 
> Using this information I can implement a stateless irq one-shot balancer that
> reacts to various udev events quite well
> 
> Signed-off-by: Neil Horman <nhorman@...driver.com>
> CC: Jesse Barnes <jbarnes@...tuousgeek.org>
> CC: linux-pci@...r.kernel.org
> ---
>  drivers/pci/pci-sysfs.c |   33 +++++++++++++++++++++++++++++++++

The reason why this will not work has already been described by Matthew,
but I want to point out that if you ever add/delete/change a sysfs file
in the kernel, you also need to add/delete/change a Documentation/ABI/
file as well.

Writing that file would have shown you how this really isn't going to
work, which is usually a good exercise in itself :)

>  1 files changed, 33 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index f8deb3e..1397dfb 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -26,6 +26,7 @@
>  #include <linux/security.h>
>  #include <linux/pci-aspm.h>
>  #include <linux/slab.h>
> +#include <linux/msi.h>
>  #include "pci.h"
>  
>  static int sysfs_initialized;	/* = 0 */
> @@ -71,6 +72,34 @@ static ssize_t broken_parity_status_store(struct device *dev,
>  	return count;
>  }
>  
> +static ssize_t irq_mode_show(struct device *dev,
> +			struct device_attribute *attr, char *buf)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +
> +	return sprintf(buf, "%s\n", pdev->msix_enabled ? "msix" :
> +				   (pdev->msi_enabled ? "msi" : "intx"));
> +}
> +
> +#ifdef CONFIG_PCI_MSI
> +static ssize_t msi_list_show(struct device *dev,
> +			struct device_attribute *attr, char *buf)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct msi_desc *entry;
> +	int first, last;
> +	ssize_t count = 0;
> +
> +	if (!(pdev->msi_enabled || pdev->msix_enabled))
> +		return 0;
> +
> +	list_for_each_entry(entry, &pdev->msi_list, list)
> +		count += sprintf(&buf[count], "%d ", entry->irq);

This breaks the "one-item-per-file" sysfs rule, so please never do this.


> +
> +	return count;
> +}
> +#endif
> +
>  static ssize_t local_cpus_show(struct device *dev,
>  			struct device_attribute *attr, char *buf)
>  {		
> @@ -328,6 +357,10 @@ struct device_attribute pci_dev_attrs[] = {
>  	__ATTR_RO(subsystem_device),
>  	__ATTR_RO(class),
>  	__ATTR_RO(irq),
> +	__ATTR_RO(irq_mode),
> +#ifdef CONFIG_PCI_MSI
> +	__ATTR_RO(msi_list),
> +#endif

Curious as to why you added this to the middle of the list?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ