linux-kernel - Re: [PATCH] xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU

Message-ID: <c8640680-b179-a6d6-cbb9-825d4bf0017d@oracle.com>
Date:   Mon, 5 Jun 2017 10:10:56 -0400
From:   Boris Ostrovsky <boris.ostrovsky@...cle.com>
To:     Anoob Soman <anoob.soman@...rix.com>,
        xen-devel@...ts.xenproject.org, linux-kernel@...r.kernel.org
Cc:     jgross@...e.com
Subject: Re: [PATCH] xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next
 online VCPU

On 06/05/2017 06:14 AM, Anoob Soman wrote:
> On 02/06/17 17:24, Boris Ostrovsky wrote:
>>>     static int set_affinity_irq(struct irq_data *data,
>>>                   const struct cpumask *dest, bool force)
>>> diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c
>>> index 10f1ef5..1192f24 100644
>>> --- a/drivers/xen/evtchn.c
>>> +++ b/drivers/xen/evtchn.c
>>> @@ -58,6 +58,8 @@
>>>   #include <xen/xen-ops.h>
>>>   #include <asm/xen/hypervisor.h>
>>>   +static DEFINE_PER_CPU(int, bind_last_selected_cpu);
>> This should be moved into evtchn_bind_interdom_next_vcpu() since that's
>> the only place referencing it.
>
> Sure, I will do it.
>
>>
>> Why is it a percpu variable BTW? Wouldn't making it global result in
>> better interrupt distribution?
>
> The reason for percpu instead of global was to avoid locking. We can
> have a global variable (last_cpu) without locking, but the value of
> last_cpu won't be consistent without locks. Moreover, since
> irq_affinity is also used in the calculation of the cpu to bind, having
> a percpu or global wouldn't really matter, as the result (selected_cpu)
> is more likely to be random (because different irqs can have different
> affinity). What do you guys suggest?

Doesn't initial affinity (which is what we expect here since irqbalance
has not run yet) typically cover all guest VCPUs?
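
For reference, the selection step being discussed can be modelled in user space. This is only a sketch, not kernel code: the masks are plain 64-bit words, and next_and() and pick_next_cpu() are hypothetical stand-ins for cpumask_next_and()/cpumask_first_and() and the wrap-around logic in the patch. It assumes at most 64 CPUs.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 64u /* assumption: the model supports at most 64 CPUs */

/*
 * Stand-in for cpumask_next_and(): return the first CPU number greater
 * than prev that is set in both masks, or NR_CPUS if there is none.
 * Passing prev == -1 starts the search at CPU 0.
 */
static unsigned int next_and(int prev, uint64_t a, uint64_t b)
{
    uint64_t both = a & b;
    unsigned int cpu;

    for (cpu = (unsigned int)(prev + 1); cpu < NR_CPUS; cpu++)
        if (both & (UINT64_C(1) << cpu))
            return cpu;
    return NR_CPUS;
}

/*
 * Round-robin pick: continue after *last, wrap to the first eligible
 * CPU when the end of the mask is reached. Returns NR_CPUS (and leaves
 * *last untouched) when affinity and online have no CPU in common.
 */
static unsigned int pick_next_cpu(unsigned int *last, uint64_t affinity,
                                  uint64_t online)
{
    unsigned int cpu;

    assert(*last < NR_CPUS);
    cpu = next_and((int)*last, affinity, online);
    if (cpu >= NR_CPUS)
        cpu = next_and(-1, affinity, online);
    if (cpu < NR_CPUS)
        *last = cpu;
    return cpu;
}
```

With an affinity mask that covers every online VCPU, a single shared counter passed as `last` visits the VCPUs in order. With one counter per CPU, each counter advances independently, so the overall spread depends on which CPU happens to run the bind.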

>
>>
>>> +
>>>   struct per_user_data {
>>>       struct mutex bind_mutex; /* serialize bind/unbind operations */
>>>       struct rb_root evtchns;
>>> @@ -421,6 +423,36 @@ static void evtchn_unbind_from_user(struct per_user_data *u,
>>>       del_evtchn(u, evtchn);
>>>   }
>>>   +static void evtchn_bind_interdom_next_vcpu(int evtchn)
>>> +{
>>> +    unsigned int selected_cpu, irq;
>>> +    struct irq_desc *desc = NULL;
>>> +    unsigned long flags;
>>> +
>>> +    irq = irq_from_evtchn(evtchn);
>>> +    desc = irq_to_desc(irq);
>>> +
>>> +    if (!desc)
>>> +        return;
>>> +
>>> +    raw_spin_lock_irqsave(&desc->lock, flags);
>>> +    selected_cpu = this_cpu_read(bind_last_selected_cpu);
>>> +    selected_cpu = cpumask_next_and(selected_cpu,
>>> +            desc->irq_common_data.affinity, cpu_online_mask);
>>> +
>>> +    if (unlikely(selected_cpu >= nr_cpu_ids))
>>> +        selected_cpu = cpumask_first_and(
>>> +                desc->irq_common_data.affinity, cpu_online_mask);
>>> +
>>> +    raw_spin_unlock_irqrestore(&desc->lock, flags);
>> I think if you follow Juergen's suggestion of wrapping everything into
>> irq_enable/disable you can drop the lock altogether (assuming you keep
>> bind_last_selected_cpu percpu).
>>
>> -boris
>>
>
> I think we would still require spin_lock(). spin_lock is for irq_desc.

If you are trying to protect affinity then it may well change after you
drop the lock.

In fact, don't you have a race here? If we offline a VCPU we will (by
way of cpu_disable_common()->fixup_irqs()) update affinity to reflect
that a CPU is gone and there is a chance that xen_rebind_evtchn_to_cpu()
will happen after that.

So, contrary to what I said earlier ;-) not only do you need the lock,
but you should hold it across xen_rebind_evtchn_to_cpu() call. Does this
make sense?
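
To illustrate the ordering, here is a user-space model, not the kernel code. A C11 atomic_flag stands in for desc->lock, and struct irq_model, bind_next() and offline_cpu() are hypothetical names. offline_cpu() is a much simplified stand-in for the fixup_irqs() path. The point is only that the CPU is chosen and bound inside one critical section, so an offline cannot slip in between the two steps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NO_CPU 64u /* assumption: at most 64 CPUs in this model */

struct irq_model {
    atomic_flag lock;       /* stand-in for desc->lock */
    uint64_t affinity;      /* stand-in for the irq's affinity mask */
    unsigned int bound_cpu; /* CPU the event channel is bound to */
};

/* Model of cpu_online_mask; here it is only changed under the lock. */
static uint64_t online_mask = 0xF;

static void model_init(struct irq_model *d, uint64_t affinity)
{
    atomic_flag_clear(&d->lock);
    d->affinity = affinity;
    d->bound_cpu = 0;
}

static void model_lock(struct irq_model *d)
{
    while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
        ; /* spin */
}

static void model_unlock(struct irq_model *d)
{
    atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* First CPU >= start set in mask, or NO_CPU. */
static unsigned int scan_from(unsigned int start, uint64_t mask)
{
    unsigned int cpu;

    for (cpu = start; cpu < NO_CPU; cpu++)
        if (mask & (UINT64_C(1) << cpu))
            return cpu;
    return NO_CPU;
}

/* Select the next CPU and bind to it without dropping the lock. */
static unsigned int bind_next(struct irq_model *d, unsigned int *last)
{
    unsigned int cpu;
    uint64_t mask;

    model_lock(d);
    mask = d->affinity & online_mask;
    cpu = scan_from(*last + 1, mask);
    if (cpu == NO_CPU)
        cpu = scan_from(0, mask);
    if (cpu != NO_CPU) {
        d->bound_cpu = cpu; /* the "rebind" happens under the lock */
        *last = cpu;
    }
    model_unlock(d);
    return cpu;
}

/* Simplified offline path: fix up affinity and move the binding. */
static void offline_cpu(struct irq_model *d, unsigned int cpu)
{
    model_lock(d);
    online_mask &= ~(UINT64_C(1) << cpu);
    d->affinity &= online_mask;
    if (!d->affinity)
        d->affinity = online_mask;
    if (d->bound_cpu == cpu)
        d->bound_cpu = scan_from(0, d->affinity);
    model_unlock(d);
}
```

If bind_next() released the lock between choosing the CPU and binding to it, offline_cpu() could run in that window and the binding could end up on a CPU that is no longer online.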

-boris


>
>>> +    this_cpu_write(bind_last_selected_cpu, selected_cpu);
>>> +
>>> +    local_irq_disable();
>>> +    /* unmask expects irqs to be disabled */
>>> +    xen_rebind_evtchn_to_cpu(evtchn, selected_cpu);
>>> +    local_irq_enable();
>>> +}
>>> +
>>>
>