linux-kernel - Re: About irq_create_affinity_masks() for a platform device driver

Message-ID: <19dc0422-5536-5565-e29f-ccfbcb8525d3@huawei.com>
Date:   Fri, 31 Jan 2020 14:25:24 +0000
From:   John Garry <john.garry@...wei.com>
To:     Thomas Gleixner <tglx@...utronix.de>
CC:     Marc Zyngier <maz@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        chenxiang <chenxiang66@...ilicon.com>
Subject: Re: About irq_create_affinity_masks() for a platform device driver


>> John Garry <john.garry@...wei.com> writes:
>>> Would there be any issue with a SCSI platform device driver referencing
>>> this function?
>>>
>>> So I have a multi-queue platform device, and I want to spread interrupts
>>> over all possible CPUs, just like we can do for PCI MSI vectors. This
>>> topic was touched on in [0].
>>>
>>> And, if so it's ok, could we export that same symbol?
>>

Hi Thomas,

>> I think you will need something similar to what we have in the pci/msi
>> code, but that shouldn't be in your device driver. So I'd rather create
>> platform infrastructure for this and export that.
>>
> 
> That would seem the proper thing to do.
> 
> So I was doing this for legacy hw as a cheap and quick performance 
> boost, but I doubt how many other users there would be in future for any 
> new API. Also, the effort could be more than the reward and so I may 
> consider dropping the whole idea.
> 
> But I'll have a play with how the code could look now.

So I'd figure that an API like this would be required:

--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -11,6 +11,7 @@
  #define _PLATFORM_DEVICE_H_

  #include <linux/device.h>
+#include <linux/interrupt.h>

  #define PLATFORM_DEVID_NONE	(-1)
  #define PLATFORM_DEVID_AUTO	(-2)
@@ -27,6 +28,7 @@ struct platform_device {
  	u64		dma_mask;
  	u32		num_resources;
  	struct resource	*resource;
+	struct irq_affinity_desc *desc;

and in platform.c, adding:

/**
  * platform_get_irqs_affinity - get all IRQs for a device with affinity
  * @dev: platform device
  * @affd: Affinity descriptor
  * @count: output, number of IRQs found for the device
  * @irqs: output, array of IRQ numbers allocated by this function
  *
  * Gets a full set of IRQs for a platform device
  *
  * Return: 0 on success, negative error number on failure.
  */
int platform_get_irqs_affinity(struct platform_device *dev,
			       struct irq_affinity *affd,
			       unsigned int *count, int **irqs)
{
	int i;
	int *pirqs;

	if (ACPI_COMPANION(&dev->dev)) {
		*count = acpi_irq_get_count(ACPI_HANDLE(&dev->dev));
	} else {
		// TODO
	}

	pirqs = kcalloc(*count, sizeof(int), GFP_KERNEL);
	if (!pirqs)
		return -ENOMEM;

	dev->desc = irq_create_affinity_masks(*count, affd);
	if (!dev->desc) {
		kfree(pirqs);
		return -ENOMEM;
	}

	for (i = 0; i < *count; i++) {
		pirqs[i] = platform_get_irq(dev, i);
		if (pirqs[i] < 0) {
			/* propagate the real error from platform_get_irq() */
			int ret = pirqs[i];

			kfree(dev->desc);
			dev->desc = NULL;
			kfree(pirqs);
			return ret;
		}
	}

	*irqs = pirqs;

	return 0;
}
EXPORT_SYMBOL_GPL(platform_get_irqs_affinity);

Here we pass the affinity descriptor and allocate all IRQs for a device.

So this is less than a half-baked solution. We only create the affinity 
masks but do nothing with them, and the actual irq_descs generated 
would not have their affinity mask set and would not be managed. 
Only the platform device driver itself would access the masks, to set 
the irq affinity hint, etc.

To achieve the proper result, we would somehow need to pass the per-IRQ 
affinity descriptor all the way down through 
platform_get_irq()->acpi_irq_get()->irq_create_fwspec_mapping()->irq_domain_alloc_irqs(), 
which could involve disruptive changes in different subsystems - not 
welcome, I'd say.

I could take the alternative approach of generating the interrupt 
affinity masks in my LLDD instead. Since I know some of the CPU and NUMA 
node properties of the device host, I could generate the masks in the 
LLDD itself quite simply, but I would still rather avoid this if 
possible and use standard APIs.
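
For illustration only, here is a rough userspace model of what generating the masks by hand could look like. It is not kernel code and not what irq_create_affinity_masks() does internally: a uint64_t stands in for struct cpumask (so at most 64 CPUs), and the function name spread_queues_over_cpus() is made up for this sketch. It splits the CPUs in a given set into one group per queue, with group sizes differing by at most one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, userspace model only.
 * Bit N of 'cpus' means "CPU N is available". On success, masks[q] holds
 * the CPUs assigned to queue q; the groups are disjoint and cover 'cpus'.
 * Returns 0 on success, -1 if the arguments cannot be satisfied.
 */
int spread_queues_over_cpus(uint64_t cpus, unsigned int nr_queues,
			    uint64_t *masks)
{
	unsigned int nr_cpus = 0, per_queue, extra;
	unsigned int cpu, q, taken = 0;

	/* Count the available CPUs. */
	for (cpu = 0; cpu < 64; cpu++)
		if (cpus & (UINT64_C(1) << cpu))
			nr_cpus++;

	/* Need at least one CPU per queue. */
	if (!masks || nr_queues == 0 || nr_cpus < nr_queues)
		return -1;

	per_queue = nr_cpus / nr_queues;
	extra = nr_cpus % nr_queues;	/* first 'extra' queues get one more */

	for (q = 0; q < nr_queues; q++)
		masks[q] = 0;

	/* Walk the CPUs in ascending order, filling one queue at a time. */
	q = 0;
	for (cpu = 0; cpu < 64; cpu++) {
		unsigned int quota;

		if (!(cpus & (UINT64_C(1) << cpu)))
			continue;

		quota = per_queue + (q < extra ? 1 : 0);
		masks[q] |= UINT64_C(1) << cpu;
		if (++taken == quota) {
			q++;
			taken = 0;
		}
	}

	/* Every queue must have received its full share. */
	assert(q == nr_queues);
	return 0;
}
```

With 8 CPUs (0xFF) and 3 queues this gives groups of 3, 3 and 2 CPUs. A NUMA-aware variant could call the same helper once per node with that node's CPU set, but that is equally an assumption about how a driver might do it.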

So if there are any better ideas on this, then it would be good to hear 
them.

Thanks,
john