[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <64adb508-8843-4eae-87fb-47797bf32b19@linux.microsoft.com>
Date: Mon, 23 Jun 2025 15:13:49 -0700
From: Nuno Das Neves <nunodasneves@...ux.microsoft.com>
To: Thomas Gleixner <tglx@...utronix.de>,
Michael Kelley <mhklinux@...look.com>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>
Cc: "kys@...rosoft.com" <kys@...rosoft.com>,
"haiyangz@...rosoft.com" <haiyangz@...rosoft.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>,
"decui@...rosoft.com" <decui@...rosoft.com>,
"catalin.marinas@....com" <catalin.marinas@....com>,
"will@...nel.org" <will@...nel.org>, "mingo@...hat.com" <mingo@...hat.com>,
"bp@...en8.de" <bp@...en8.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"hpa@...or.com" <hpa@...or.com>,
"lpieralisi@...nel.org" <lpieralisi@...nel.org>, "kw@...ux.com"
<kw@...ux.com>, "robh@...nel.org" <robh@...nel.org>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"jinankjain@...ux.microsoft.com" <jinankjain@...ux.microsoft.com>,
"skinsburskii@...ux.microsoft.com" <skinsburskii@...ux.microsoft.com>,
"mrathor@...ux.microsoft.com" <mrathor@...ux.microsoft.com>,
"x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH 3/4] x86: hyperv: Expose hv_map_msi_interrupt function
On 6/20/2025 9:19 AM, Thomas Gleixner wrote:
> On Wed, Jun 18 2025 at 14:08, Nuno Das Neves wrote:
>> On 6/11/2025 4:07 PM, Michael Kelley wrote:
>>> From: Nuno Das Neves <nunodasneves@...ux.microsoft.com> Sent: Tuesday, June 10, 2025 4:52 PM
>>>> +/**
>>>> + * hv_map_msi_interrupt() - "Map" the MSI IRQ in the hypervisor.
>>>> + * @data: Describes the IRQ
>>>> + * @out_entry: Hypervisor (MSI) interrupt entry (can be NULL)
>>>> + *
>>>> + * Map the IRQ in the hypervisor by issuing a MAP_DEVICE_INTERRUPT hypercall.
>>>> + */
>>>> +int hv_map_msi_interrupt(struct irq_data *data,
>>>> + struct hv_interrupt_entry *out_entry)
>>>> {
>>>> - union hv_device_id device_id = hv_build_pci_dev_id(dev);
>>>> + struct msi_desc *msidesc;
>>>> + struct pci_dev *dev;
>>>> + union hv_device_id device_id;
>>>> + struct hv_interrupt_entry dummy;
>>>> + struct irq_cfg *cfg = irqd_cfg(data);
>>>> + const cpumask_t *affinity;
>>>> + int cpu;
>>>> + u64 res;
>
> https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#variable-declarations
>
>>>>
>>>> - return hv_map_interrupt(device_id, false, cpu, vector, entry);
>>>> + msidesc = irq_data_get_msi_desc(data);
>>>> + dev = msi_desc_to_pci_dev(msidesc);
>>>> + device_id = hv_build_pci_dev_id(dev);
>>>> + affinity = irq_data_get_effective_affinity_mask(data);
>>>> + cpu = cpumask_first_and(affinity, cpu_online_mask);
>>>
>>> Is the cpus_read_lock held at this point? I'm not sure what the
>>> overall calling sequence looks like. If it is not held, the CPU that
>>> is selected could go offline before hv_map_interrupt() is called.
>>> This computation of the target CPU is the same as in the code
>>> before this patch, but that existing code looks like it has the
>>> same problem.
>>>
>>
>> Thanks for pointing it out - It *looks* like the read lock is not held
>> everywhere this could be called, so it could indeed be a problem.
>>
>> I've been thinking about different ways around this but I lack the
>> knowledge to have an informed opinion about it:
>>
>> - We could take the cpu read lock in this function, would that work?
>
> Obviously not.
>
>> - I'm not actually sure why the code is getting the first cpu off the effective
>> affinity mask in the first place. It is possible to get the apic id (and hence
>> the cpu) already associated with the irq, as per e.g. x86_vector_msi_compose_msg()
>> Maybe we could get the cpu that way, assuming that doesn't have a
>> similar issue.
>
> There is no reason to fiddle in the underlying low level data. The
> effective affinity mask is there for a reason.
>
>> - We could just let this race happen, maybe the outcome isn't too catastrophic?
>
> Let's terminate guesswork mode and look at the facts.
>
> The point is that hv_map_msi_interrupt() is called from:
>
> 1) hv_irq_compose_msi_msg()
>
> 2) hv_arch_irq_unmask() (in patch 4/4)
>
> Both functions are interrupt chip callbacks and invoked with the
> interrupt descriptor lock held.
>
> At the point where they are called, the effective affinity mask is valid
> and immutable. Nothing can modify it as any modification requires the
> interrupt descriptor lock to be held. This applies to the CPU hotplug
> machinery too. So this AND cpu_online_mask is a complete pointless
> voodoo exercise.
>
Thanks for explaining.
> Just use:
>
> cpu = cpumask_first(irq_data_get_effective_affinity_mask(data));
>
> and be done with it.
>
> Please fix that first with a seperate patch before moving this code
> around.
Will do!
Nuno
>
> Thanks,
>
> tglx
Powered by blists - more mailing lists