Message-ID: <238b2fd0-33ab-4279-9205-de58332fa944@amd.com>
Date: Thu, 14 Aug 2025 17:04:15 -0400
From: Jason Andryuk <jason.andryuk@....com>
To: Jürgen Groß <jgross@...e.com>, Stefano Stabellini
	<sstabellini@...nel.org>, Oleksandr Tyshchenko
	<oleksandr_tyshchenko@...m.com>, Chris Wright <chrisw@...s-sol.org>, "Jeremy
 Fitzhardinge" <jeremy@...source.com>
CC: <stable@...r.kernel.org>, <xen-devel@...ts.xenproject.org>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] xen/events: Fix Global and Domain VIRQ tracking

On 2025-08-14 03:05, Jürgen Groß wrote:
> On 13.08.25 17:03, Jason Andryuk wrote:
>> On 2025-08-12 15:00, Jason Andryuk wrote:
>>> VIRQs come in 3 flavors: per-VCPU, per-domain, and global.  The existing
>>> tracking of VIRQs is handled by per-cpu variables virq_to_irq.
>>>
>>> The issue is that bind_virq_to_irq() sets the per_cpu virq_to_irq at
>>> registration time - typically CPU 0.  Later, the interrupt can migrate,
>>> and info->cpu is updated.  When calling unbind_from_irq(), the per-cpu
>>> virq_to_irq is cleared for a different cpu.  If bind_virq_to_irq() is
> 
> This is what needs to be fixed. At migration the per_cpu virq_to_irq of the
> source and the target cpu need to be updated to reflect that migration.

I considered this, and even implemented it, before changing my approach.
My concern was that the single VIRQ would then live in only one of the N
per_cpu virq_to_irq arrays.  A second attempt to register it on CPU 0
would likely find -1 there, continue, and issue the bind hypercall again.

It looks like Xen tracks the virq on the vcpu passed to bind_virq, so
per-domain/global VIRQs stay on vcpu 0.  Binding again would return
-EEXIST.  find_virq() would not match the virq if it had been re-bound to
a different vcpu.
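
For reference, the Xen-side check being referred to here looks roughly
like this (paraphrased from xen/common/event_channel.c, heavily
abbreviated and from memory, not a verbatim quote):

    /* evtchn_bind_virq(), sketch only */
    struct vcpu *v = d->vcpu[bind->vcpu];   /* the vcpu in the bind request */

    if ( v->virq_to_evtchn[virq] != 0 )
        return -EEXIST;                     /* already bound on this vcpu */
    ...
    v->virq_to_evtchn[virq] = port;         /* tracked on the binding vcpu,
                                               i.e. vcpu 0 for per-domain and
                                               global VIRQs */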

If we don't care about handling duplicate registration, then updating
the virq_to_irq tables should be fine.
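
For completeness, the migration-time update I had implemented was along
these lines (a rough sketch only; the helper name, its call site and the
locking are illustrative, not the actual code):

    /* Sketch: move the virq_to_irq entry when a VIRQ is retargeted from
     * info->cpu to tcpu.  Locking/ordering details omitted. */
    static void virq_move_cpu(struct irq_info *info, unsigned int tcpu)
    {
        int virq = virq_from_irq(info);

        per_cpu(virq_to_irq, info->cpu)[virq] = -1;
        per_cpu(virq_to_irq, tcpu)[virq] = info->irq;
    }

with virq_move_cpu() called from the affinity-change path before
info->cpu is updated to the new cpu.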

>>> called again with CPU 0, the stale irq is returned.
>>>
>>> Change the virq_to_irq tracking to use CPU 0 for per-domain and global
>>> VIRQs.  As there can be at most one of each, there is no need for
>>> per-vcpu tracking.  Also, per-domain and global VIRQs need to be
>>> registered on CPU 0 and can later move, so this matches the expectation.
>>>
>>> Fixes: e46cdb66c8fc ("xen: event channels")
>>> Cc: stable@...r.kernel.org
>>> Signed-off-by: Jason Andryuk <jason.andryuk@....com>
>>> ---
>>> Fixes is the introduction of the virq_to_irq per-cpu array.
>>>
>>> This was found with the out-of-tree argo driver during suspend/resume.
>>> On suspend, the per-domain VIRQ_ARGO is unbound.  On resume, the driver
>>> attempts to bind VIRQ_ARGO.  The stale irq is returned, but the
>>> WARN_ON(info == NULL || info->type != IRQT_VIRQ) in bind_virq_to_irq()
>>> triggers for NULL info.  The bind fails and execution continues with the
>>> driver trying to clean up by unbinding.  This eventually faults over the
>>> NULL info.
>>> ---
>>>   drivers/xen/events/events_base.c | 17 ++++++++++++++++-
>>>   1 file changed, 16 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
>>> index 41309d38f78c..a27e4d7f061e 100644
>>> --- a/drivers/xen/events/events_base.c
>>> +++ b/drivers/xen/events/events_base.c
>>> @@ -159,7 +159,19 @@ static DEFINE_MUTEX(irq_mapping_update_lock);
>>>   static LIST_HEAD(xen_irq_list_head);
>>> -/* IRQ <-> VIRQ mapping. */
>>> +static bool is_per_vcpu_virq(int virq) {
>>> +    switch (virq) {
>>> +    case VIRQ_TIMER:
>>> +    case VIRQ_DEBUG:
>>> +    case VIRQ_XENOPROF:
>>> +    case VIRQ_XENPMU:
>>> +        return true;
>>> +    default:
>>> +        return false;
>>> +    }
>>> +}
>>> +
>>> +/* IRQ <-> VIRQ mapping.  Global/Domain virqs are tracked in cpu 0.  */
>>>   static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1};
>>>   /* IRQ <-> IPI mapping */
>>> @@ -974,6 +986,9 @@ static void __unbind_from_irq(struct irq_info *info, unsigned int irq)
>>>           switch (info->type) {
>>>           case IRQT_VIRQ:
>>> +            if (!is_per_vcpu_virq(virq_from_irq(info)))
>>> +                cpu = 0;
>>> +
>>>               per_cpu(virq_to_irq, cpu)[virq_from_irq(info)] = -1;
>>>               break;
>>>           case IRQT_IPI:
>>
>> Thinking about it a little more, bind_virq_to_irq() should force cpu
>> == 0 for per-domain and global VIRQs to ensure the property holds.
>> Also virq_to_irq
> 
> In Xen's evtchn_bind_virq() there is:
> 
>      if ( type != VIRQ_VCPU && vcpu != 0 )
>          return -EINVAL;
> 
> Making sure in Linux that there is never a violation of that restriction
> would require always keeping an up-to-date table of all possible VIRQs and
> their type, which I'd like to avoid.

Yes, I agree with this.

> I think it is the user of the VIRQ who is responsible for ensuring that
> cpu 0 is passed to bind_virq_to_irq(), as this user knows that such a
> restriction applies to the VIRQ in question (at least he should know that).
> 
> VIRQs actually used in the kernel can of course get some special handling,
> as they are known already and should be used correctly.

Thanks,
Jason
