Message-ID: <88d64d51-4344-e908-b55b-0583b0137ddf@huawei.com>
Date: Tue, 21 Jan 2020 10:46:41 +0000
From: John Garry <john.garry@...wei.com>
To: Thomas Gleixner <tglx@...utronix.de>, Marc Zyngier <maz@...nel.org>
CC: <linux-kernel@...r.kernel.org>,
Jason Cooper <jason@...edaemon.net>,
"Ming Lei" <ming.lei@...hat.com>,
"chenxiang (M)" <chenxiang66@...ilicon.com>
Subject: Re: [PATCH] irqchip/gic-v3-its: Balance initial LPI affinity across
CPUs
On 20/01/2020 19:24, Thomas Gleixner wrote:
> Marc,
>
> Marc Zyngier <maz@...nel.org> writes:
>> We're stuck between a rock and a hard place here:
>>
>> (1) We place all interrupts on the least loaded CPU that matches
>> the affinity -> results in performance issues on some funky
>> HW (like D05's SAS controller).
I think that the driver itself was more of the issue in that case, and
I'm fixing that in the driver by spreading the interrupts properly.
But I am not sure which other platforms rely on this behavior.
>>
>> (2) We place managed interrupts on the least loaded CPU that matches
>> the affinity -> we have artificial load on NUMA boundaries, and
>> reduced spread of overlapping managed interrupts.
>>
>> (3) We don't account for non-managed LPIs, and we run the risk of
>> unpredictable performance because we don't really know where
>> the *other* interrupts are.
>>
>> My personal preference would be to go for (1), as in my original post.
That seems reasonable, but I like how x86 accounts only for the per-CPU
managed interrupt count when choosing the target CPU for a managed
interrupt.
>> I find (3) the least appealing, because we don't track things anymore.
>> (2) feels like "the least of all evils", as it is a decent performance
>> gain, seems to give predictable performance, and doesn't regress lesser
>> systems...
>>
>> I'm definitely open to suggestions here.
>
> The way x86 does it (and that's mostly OK, except for some really broken
> setups) is:
>
> 1) Non-managed interrupts:
>
> If the interrupt is bound to a node, then we try to find a target
>
> I) in the intersection of affinity mask and node mask.
>
> II) in the nodemask itself
>
> Yes, we ignore the affinity mask there because that's pretty much
> the same as if the given affinity does not contain an online
> CPU.
>
> If all of that fails then we try the nodeless mode
>
> If the interrupt is not bound to a node, then we try to find a target
>
> I) in the intersection of affinity mask and online mask.
>
> II) in the onlinemask itself
>
> Each step searches for the CPU in the searched mask which has the
> least number of total interrupts assigned.
>
> 2) Managed interrupts
>
> For managed interrupts we just search in the intersection of assigned
> mask and online CPUs for the CPU with the least number of managed
> interrupts.
As above, this is the approach that I would prefer we take.
>
> If no CPU is online then the interrupt is shutdown anyway, so no
> fallback required.
>
> Don't know whether that's something you can map to ARM64, but I assume
> the principle of trying to enforce NUMA locality plus balancing the
> number of interrupts makes sense in general.
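Just to check that I follow that ordering, here is a rough sketch of the
selection as I read it (with made-up per-CPU counters, not the actual
x86 code):

#include <linux/cpumask.h>
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/kernel.h>
#include <linux/numa.h>
#include <linux/topology.h>

/*
 * Sketch only: pick the CPU in @mask with the fewest interrupts,
 * according to some hypothetical per-CPU count[] array.
 */
static int sketch_pick_least_loaded(const struct cpumask *mask,
                                    const unsigned int *count)
{
        unsigned int cpu, best = UINT_MAX;
        int target = -1;

        for_each_cpu(cpu, mask) {
                if (count[cpu] < best) {
                        best = count[cpu];
                        target = cpu;
                }
        }
        return target;
}

/* Non-managed: node-aware steps first, then the nodeless fallback. */
static int sketch_select_nonmanaged(int node, const struct cpumask *affinity,
                                    const unsigned int *total_count)
{
        cpumask_var_t tmp;
        int cpu = -1;

        if (!zalloc_cpumask_var(&tmp, GFP_KERNEL))
                return -ENOMEM;

        if (node != NUMA_NO_NODE) {
                /* I) intersection of affinity mask and node mask */
                cpumask_and(tmp, affinity, cpumask_of_node(node));
                cpu = sketch_pick_least_loaded(tmp, total_count);

                /* II) the nodemask itself, ignoring the affinity mask */
                if (cpu < 0)
                        cpu = sketch_pick_least_loaded(cpumask_of_node(node),
                                                       total_count);
        }

        if (cpu < 0) {
                /* Nodeless: I) affinity & online, then II) the online mask */
                cpumask_and(tmp, affinity, cpu_online_mask);
                cpu = sketch_pick_least_loaded(tmp, total_count);
                if (cpu < 0)
                        cpu = sketch_pick_least_loaded(cpu_online_mask,
                                                       total_count);
        }

        free_cpumask_var(tmp);
        return cpu;
}

/* Managed: only affinity & online, CPU with the fewest managed interrupts. */
static int sketch_select_managed(const struct cpumask *affinity,
                                 const unsigned int *managed_count)
{
        cpumask_var_t tmp;
        int cpu;

        if (!zalloc_cpumask_var(&tmp, GFP_KERNEL))
                return -ENOMEM;

        cpumask_and(tmp, affinity, cpu_online_mask);
        cpu = sketch_pick_least_loaded(tmp, managed_count);

        free_cpumask_var(tmp);
        return cpu;
}
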
I guess that we could use the irq matrix code directly if we wanted to go
this way. That's why it is in a common location...
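e.g. something roughly like this, going by the interfaces in
kernel/irq/matrix.c as I read them (again only a sketch, with made-up
sizes and the usual error handling omitted):

#include <linux/cpumask.h>
#include <linux/errno.h>
#include <linux/irq.h>

static struct irq_matrix *lpi_matrix;

static int sketch_matrix_init(void)
{
        /* Made-up sizing: 256 slots per CPU, all open for allocation */
        lpi_matrix = irq_alloc_matrix(256, 0, 256);
        if (!lpi_matrix)
                return -ENOMEM;

        /* Each CPU would call irq_matrix_online() as it comes up */
        irq_matrix_online(lpi_matrix);
        return 0;
}

/* Pick the least-loaded CPU in @affinity by allocating a slot on it */
static int sketch_matrix_pick_cpu(const struct cpumask *affinity, bool managed)
{
        unsigned int cpu;
        int ret;

        if (managed) {
                /* Managed masks have to be reserved up front */
                ret = irq_matrix_reserve_managed(lpi_matrix, affinity);
                if (ret)
                        return ret;
                ret = irq_matrix_alloc_managed(lpi_matrix, affinity, &cpu);
        } else {
                ret = irq_matrix_alloc(lpi_matrix, affinity, false, &cpu);
        }

        return ret < 0 ? ret : cpu;
}
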
Cheers,
John