lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <YekAgfkDgV6z6hYV@kroah.com>
Date:   Thu, 20 Jan 2022 07:26:09 +0100
From:   Greg KH <gregkh@...uxfoundation.org>
To:     Vikash Bansal <bvikas@...are.com>
Cc:     bhelgaas@...gle.com, linux-pci@...r.kernel.org,
        linux-kernel@...r.kernel.org, srivatsab@...are.com,
        srivatsa@...il.mit.edu, amakhalov@...are.com, srinidhir@...are.com,
        anishs@...are.com, vsirnapalli@...are.com, akaher@...are.com
Subject: Re: [PATCH] PCI: Speed up device init by parsing capabilities all at
 once

On Tue, Jan 18, 2022 at 09:16:01AM -0800, Vikash Bansal wrote:
> In the current implementation, the PCI capability list is parsed from
> the beginning to find each capability, which results in a large number
> of redundant PCI reads.
> 
> Instead, we can parse the complete list just once, store it in the
> pci_dev structure, and get the offset of each capability directly from
> the pci_dev structure.
> 
> This implementation improves pci devices initialization time  by ~2-3% in
> case of bare metal and 7-8% in case of VM running on ESXi.

What is that in terms of "wall clock" time?  % is hard to know here, and
of course it will depend on the PCI bus speed, right?

> It also adds a memory overhead of 20bytes (value of PCI_CAP_ID_MAX) per
> PCI device.
> 
> Signed-off-by: Vikash Bansal <bvikas@...are.com>
> ---
>  drivers/pci/pci.c   | 43 ++++++++++++++++++++++++++++++++++++-------
>  drivers/pci/probe.c |  5 +++++
>  include/linux/pci.h |  2 ++
>  3 files changed, 43 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 3d2fb394986a..8e024db30262 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -468,6 +468,41 @@ static u8 __pci_bus_find_cap_start(struct pci_bus *bus,
>  	return 0;
>  }
>  
> +
> +/**
> + * pci_find_all_capabilities - Read all capabilities
> + * @dev: the PCI device
> + *
> + * Read all capabilities and store offsets in cap_off
> + * array in pci_dev structure.
> + */
> +void pci_find_all_capabilities(struct pci_dev *dev)
> +{
> +	int ttl = PCI_FIND_CAP_TTL;
> +	u16 ent;
> +	u8 pos;
> +	u8 id;
> +
> +	pos = __pci_bus_find_cap_start(dev->bus, dev->devfn, dev->hdr_type);
> +	if (!pos)
> +		return;
> +	pci_bus_read_config_byte(dev->bus, dev->devfn, pos, &pos);
> +	while (ttl--) {
> +		if (pos < 0x40)

What is this magic value of 0x40?

> +			break;
> +		pos &= ~3;

Why ~3?

> +		pci_bus_read_config_word(dev->bus, dev->devfn, pos, &ent);
> +		id = ent & 0xff;

Do you really need the & if you are truncating it?

> +		if (id == 0xff)
> +			break;
> +
> +		/* Read first instance of capability */
> +		if (!(dev->cap_off[id]))
> +			dev->cap_off[id] = pos;

Shouldn't you have checked this before you read the value?

> +		pos = (ent >> 8);

What about walking the list using __pci_find_next_cap() like before?
Why is this somehow the same as the old function?

> +	}
> +}
> +
>  /**
>   * pci_find_capability - query for devices' capabilities
>   * @dev: PCI device to query
> @@ -489,13 +524,7 @@ static u8 __pci_bus_find_cap_start(struct pci_bus *bus,
>   */
>  u8 pci_find_capability(struct pci_dev *dev, int cap)
>  {
> -	u8 pos;
> -
> -	pos = __pci_bus_find_cap_start(dev->bus, dev->devfn, dev->hdr_type);
> -	if (pos)
> -		pos = __pci_find_next_cap(dev->bus, dev->devfn, pos, cap);
> -
> -	return pos;
> +	return dev->cap_off[cap];
>  }
>  EXPORT_SYMBOL(pci_find_capability);
>  
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 087d3658f75c..bacab12cedbb 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1839,6 +1839,11 @@ int pci_setup_device(struct pci_dev *dev)
>  	dev->hdr_type = hdr_type & 0x7f;
>  	dev->multifunction = !!(hdr_type & 0x80);
>  	dev->error_state = pci_channel_io_normal;
> +	/*
> +	 * Read all capabilities and store offsets in cap_off
> +	 * array in pci_dev structure.
> +	 */

Comment is not needed if the function name is descriptive.

> +	pci_find_all_capabilities(dev);

And it is, so no need for the comment.

>  	set_pcie_port_type(dev);
>  
>  	pci_set_of_node(dev);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 18a75c8e615c..d221c73e67f8 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -326,6 +326,7 @@ struct pci_dev {
>  	unsigned int	class;		/* 3 bytes: (base,sub,prog-if) */
>  	u8		revision;	/* PCI revision, low byte of class word */
>  	u8		hdr_type;	/* PCI header type (`multi' flag masked out) */
> +	u8              cap_off[PCI_CAP_ID_MAX]; /* Offsets of all pci capabilities */

Did you run 'pahole' to ensure you are not adding extra padding bytes
here?

>  #ifdef CONFIG_PCIEAER
>  	u16		aer_cap;	/* AER capability offset */
>  	struct aer_stats *aer_stats;	/* AER stats for this device */
> @@ -1128,6 +1129,7 @@ void pci_sort_breadthfirst(void);
>  
>  u8 pci_bus_find_capability(struct pci_bus *bus, unsigned int devfn, int cap);
>  u8 pci_find_capability(struct pci_dev *dev, int cap);
> +void pci_find_all_capabilities(struct pci_dev *dev);

Why is this now a global function and not one just local to the pci
core?  Who else would ever need to call it?

thanks,

greg k-h

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ