lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0cbf645b-b23a-6c85-4389-bb039a677a52@intel.com>
Date:   Thu, 17 Nov 2022 15:33:49 -0800
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>
CC:     <x86@...nel.org>, Joerg Roedel <joro@...tes.org>,
        Will Deacon <will@...nel.org>, <linux-pci@...r.kernel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        "Marc Zyngier" <maz@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Jason Gunthorpe <jgg@...lanox.com>,
        Dave Jiang <dave.jiang@...el.com>,
        "Alex Williamson" <alex.williamson@...hat.com>,
        Kevin Tian <kevin.tian@...el.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Logan Gunthorpe <logang@...tatee.com>,
        Ashok Raj <ashok.raj@...el.com>, Jon Mason <jdmason@...zu.us>,
        Allen Hubbe <allenbh@...il.com>,
        "Ahmed S. Darwish" <darwi@...utronix.de>
Subject: Re: [patch 21/33] genirq/msi: Provide msi_domain_alloc_irq_at()

Hi Thomas,

I am trying all three parts of this work out with some experimental
code within the IDXD driver that attempts to use IMS on the host.

In the test, pci_ims_alloc_irq() always encounters -EBUSY and it
seems that there is an attempt to insert the struct msi_desc into
the xarray twice, the second attempt encountering the -EBUSY.

While trying to understand what is going on I found myself looking
at this code area and I'll annotate this patch with what I learned.

On 11/11/2022 5:58 AM, Thomas Gleixner wrote:

...

> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -39,6 +39,7 @@ static inline int msi_sysfs_create_group
>  /* Invalid XA index which is outside of any searchable range */
>  #define MSI_XA_MAX_INDEX	(ULONG_MAX - 1)
>  #define MSI_XA_DOMAIN_SIZE	(MSI_MAX_INDEX + 1)
> +#define MSI_ANY_INDEX		UINT_MAX
>  
>  static inline void msi_setup_default_irqdomain(struct device *dev, struct msi_device_data *md)
>  {
> @@ -126,18 +127,34 @@ static int msi_insert_desc(struct device

When calling pci_ims_alloc_irq(), msi_insert_desc() ends up being
called twice, first with index = MSI_ANY_INDEX, second with index = 0.
(domid = 1 both times)

>  	}
>  
>  	hwsize = msi_domain_get_hwsize(dev, domid);
> -	if (index >= hwsize) {
> -		ret = -ERANGE;
> -		goto fail;
> -	}
>  
> -	desc->msi_index = index;
> -	index += baseidx;
> -	ret = xa_insert(&md->__store, index, desc, GFP_KERNEL);
> -	if (ret)
> -		goto fail;
> -	return 0;
> +	if (index == MSI_ANY_INDEX) {
> +		struct xa_limit limit;
> +		unsigned int index;
> +
> +		limit.min = baseidx;
> +		limit.max = baseidx + hwsize - 1;
>  
> +		/* Let the xarray allocate a free index within the limits */
> +		ret = xa_alloc(&md->__store, &index, desc, limit, GFP_KERNEL);
> +		if (ret)
> +			goto fail;
> +

This path (index == MSI_ANY_INDEX) is followed when msi_insert_desc()
is called the first time and the xa_alloc() succeeds at index 65536.

> +		desc->msi_index = index;

This is problematic with desc->msi_index being a u16, assigning
65536 to it becomes 0.

> +		return 0;
> +	} else {
> +		if (index >= hwsize) {
> +			ret = -ERANGE;
> +			goto fail;
> +		}
> +
> +		desc->msi_index = index;
> +		index += baseidx;
> +		ret = xa_insert(&md->__store, index, desc, GFP_KERNEL);
> +		if (ret)
> +			goto fail;

This "else" path is followed when msi_insert_desc() is called the second
time with "index = 0". The xa_insert() above fails at index 65536
(baseidx = 65536) with -EBUSY, trickling up as the return code to
pci_ims_alloc_irq().

> +		return 0;
> +	}
>  fail:
>  	msi_free_desc(desc);
>  	return ret;
> @@ -335,7 +352,7 @@ int msi_setup_device_data(struct device
>  
>  	msi_setup_default_irqdomain(dev, md);
>  
> -	xa_init(&md->__store);
> +	xa_init_flags(&md->__store, XA_FLAGS_ALLOC);
>  	mutex_init(&md->mutex);
>  	md->__iter_idx = MSI_XA_MAX_INDEX;
>  	dev->msi.data = md;
> @@ -1423,6 +1440,72 @@ int msi_domain_alloc_irqs_all_locked(str
>  	return msi_domain_alloc_locked(dev, &ctrl);
>  }
>  
> +/**
> + * msi_domain_alloc_irq_at - Allocate an interrupt from a MSI interrupt domain at
> + *			     a given index - or at the next free index
> + *
> + * @dev:	Pointer to device struct of the device for which the interrupts
> + *		are allocated
> + * @domid:	Id of the interrupt domain to operate on
> + * @index:	Index for allocation. If @index == %MSI_ANY_INDEX the allocation
> + *		uses the next free index.
> + * @affdesc:	Optional pointer to an interrupt affinity descriptor structure
> + * @cookie:	Optional pointer to a descriptor specific cookie to be stored
> + *		in msi_desc::data. Must be NULL for MSI-X allocations
> + *
> + * This requires a MSI interrupt domain which lets the core code manage the
> + * MSI descriptors.
> + *
> + * Return: struct msi_map
> + *
> + *	On success msi_map::index contains the allocated index number and
> + *	msi_map::virq the corresponding Linux interrupt number
> + *
> + *	On failure msi_map::index contains the error code and msi_map::virq
> + *	is %0.
> + */
> +struct msi_map msi_domain_alloc_irq_at(struct device *dev, unsigned int domid, unsigned int index,
> +				       const struct irq_affinity_desc *affdesc,
> +				       union msi_dev_cookie *cookie)
> +{
> +	struct irq_domain *domain;
> +	struct msi_map map = { };
> +	struct msi_desc *desc;
> +	int ret;
> +
> +	msi_lock_descs(dev);
> +	domain = msi_get_device_domain(dev, domid);
> +	if (!domain) {
> +		map.index = -ENODEV;
> +		goto unlock;
> +	}
> +
> +	desc = msi_alloc_desc(dev, 1, affdesc);
> +	if (!desc) {
> +		map.index = -ENOMEM;
> +		goto unlock;
> +	}
> +
> +	if (cookie)
> +		desc->data.cookie = *cookie;
> +
> +	ret = msi_insert_desc(dev, desc, domid, index);
> +	if (ret) {
> +		map.index = ret;
> +		goto unlock;
> +	}

Above is the first call to msi_insert_desc(/* index = MSI_ANY_INDEX */)

> +
> +	map.index = desc->msi_index;

msi_insert_desc() did attempt to set desc->msi_index to 65536 but map.index ends
up being 0.

> +	ret = msi_domain_alloc_irqs_range_locked(dev, domid, map.index, map.index);

Here is where the second call to msi_insert_desc() originates:

msi_domain_alloc_irqs_range_locked() -> msi_domain_alloc_locked() -> \
__msi_domain_alloc_locked() -> msi_domain_alloc_simple_msi_descs() -> \
msi_domain_add_simple_msi_descs() -> msi_insert_desc()
		

> +	if (ret)
> +		map.index = ret;
> +	else
> +		map.virq = desc->irq;
> +unlock:
> +	msi_unlock_descs(dev);
> +	return map;
> +}
> +
>  static void __msi_domain_free_irqs(struct device *dev, struct irq_domain *domain,
>  				   struct msi_ctrl *ctrl)
>  {
> 

Reinette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ