Message-ID: <88d64d51-4344-e908-b55b-0583b0137ddf@huawei.com>
Date: Tue, 21 Jan 2020 10:46:41 +0000
From: John Garry <john.garry@...wei.com>
To: Thomas Gleixner <tglx@...utronix.de>, Marc Zyngier <maz@...nel.org>
CC: <linux-kernel@...r.kernel.org>,
Jason Cooper <jason@...edaemon.net>,
"Ming Lei" <ming.lei@...hat.com>,
"chenxiang (M)" <chenxiang66@...ilicon.com>
Subject: Re: [PATCH] irqchip/gic-v3-its: Balance initial LPI affinity across
CPUs
On 20/01/2020 19:24, Thomas Gleixner wrote:
> Marc,
>
> Marc Zyngier <maz@...nel.org> writes:
>> We're stuck between a rock and a hard place here:
>>
>> (1) We place all interrupts on the least loaded CPU that matches
>> the affinity -> results in performance issues on some funky
>> HW (like D05's SAS controller).
I think that the driver itself was more of the issue in that case, and
I'm fixing that in the driver by spreading the interrupts properly.
But I am not sure which other platforms rely on this behavior.
>>
>> (2) We place managed interrupts on the least loaded CPU that matches
>> the affinity -> we have artificial load on NUMA boundaries, and
>> reduced spread of overlapping managed interrupts.
>>
>> (3) We don't account for non-managed LPIs, and we run the risk of
>> unpredictable performance because we don't really know where
>> the *other* interrupts are.
>>
>> My personal preference would be to go for (1), as in my original post.
That seems reasonable, but I like how x86 accounts only for the per-CPU
managed interrupt count when choosing the target CPU for a managed
interrupt.
>> I find (3) the least appealing, because we don't track things anymore.
>> (2) feels like "the least of all evils", as it is a decent performance
>> gain, seems to give predictable performance, and doesn't regress lesser
>> systems...
>>
>> I'm definitely open to suggestions here.
>
> The way x86 does it (and that's mostly OK, except for some really broken
> setups) is:
>
> 1) Non-managed interrupts:
>
> If the interrupt is bound to a node, then we try to find a target
>
> I) in the intersection of affinity mask and node mask.
>
> II) in the nodemask itself
>
> Yes, we ignore the affinity mask there because that's pretty much
> the same as if the given affinity does not contain an online
> CPU.
>
> If all of that fails then we try the nodeless mode
>
> If the interrupt is not bound to a node, then we try to find a target
>
> I) in the intersection of affinity mask and online mask.
>
> II) in the onlinemask itself
>
> Each step searches for the CPU in the searched mask which has the
> least number of total interrupts assigned.
>
> 2) Managed interrupts
>
> For managed interrupts we just search in the intersection of assigned
> mask and online CPUs for the CPU with the least number of managed
> interrupts.
As above, this is the approach that I would prefer we take.
>
> If no CPU is online then the interrupt is shutdown anyway, so no
> fallback required.
>
> Don't know whether that's something you can map to ARM64, but I assume
> the principle of trying to enforce NUMA locality plus balancing the
> number of interrupts makes sense in general.
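Just to check that I follow that ordering, here is a rough sketch of the
selection as I read it (with made-up per-CPU counters, not the actual
x86 code):

#include <linux/cpumask.h>
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/kernel.h>
#include <linux/numa.h>
#include <linux/topology.h>

/*
 * Sketch only: pick the CPU in @mask with the fewest interrupts,
 * according to some hypothetical per-CPU count[] array.
 */
static int sketch_pick_least_loaded(const struct cpumask *mask,
                                    const unsigned int *count)
{
        unsigned int cpu, best = UINT_MAX;
        int target = -1;

        for_each_cpu(cpu, mask) {
                if (count[cpu] < best) {
                        best = count[cpu];
                        target = cpu;
                }
        }
        return target;
}

/* Non-managed: node-aware steps first, then the nodeless fallback. */
static int sketch_select_nonmanaged(int node, const struct cpumask *affinity,
                                    const unsigned int *total_count)
{
        cpumask_var_t tmp;
        int cpu = -1;

        if (!zalloc_cpumask_var(&tmp, GFP_KERNEL))
                return -ENOMEM;

        if (node != NUMA_NO_NODE) {
                /* I) intersection of affinity mask and node mask */
                cpumask_and(tmp, affinity, cpumask_of_node(node));
                cpu = sketch_pick_least_loaded(tmp, total_count);

                /* II) the nodemask itself, ignoring the affinity mask */
                if (cpu < 0)
                        cpu = sketch_pick_least_loaded(cpumask_of_node(node),
                                                       total_count);
        }

        if (cpu < 0) {
                /* Nodeless: I) affinity & online, then II) the online mask */
                cpumask_and(tmp, affinity, cpu_online_mask);
                cpu = sketch_pick_least_loaded(tmp, total_count);
                if (cpu < 0)
                        cpu = sketch_pick_least_loaded(cpu_online_mask,
                                                       total_count);
        }

        free_cpumask_var(tmp);
        return cpu;
}

/* Managed: only affinity & online, CPU with the fewest managed interrupts. */
static int sketch_select_managed(const struct cpumask *affinity,
                                 const unsigned int *managed_count)
{
        cpumask_var_t tmp;
        int cpu;

        if (!zalloc_cpumask_var(&tmp, GFP_KERNEL))
                return -ENOMEM;

        cpumask_and(tmp, affinity, cpu_online_mask);
        cpu = sketch_pick_least_loaded(tmp, managed_count);

        free_cpumask_var(tmp);
        return cpu;
}
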
I guess that we could use the irq matrix code directly if we wanted to go
this way. That's why it is in a common location...
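e.g. something roughly like this, going by the interfaces in
kernel/irq/matrix.c as I read them (again only a sketch, with made-up
sizes and the usual error handling omitted):

#include <linux/cpumask.h>
#include <linux/errno.h>
#include <linux/irq.h>

static struct irq_matrix *lpi_matrix;

static int sketch_matrix_init(void)
{
        /* Made-up sizing: 256 slots per CPU, all open for allocation */
        lpi_matrix = irq_alloc_matrix(256, 0, 256);
        if (!lpi_matrix)
                return -ENOMEM;

        /* Each CPU would call irq_matrix_online() as it comes up */
        irq_matrix_online(lpi_matrix);
        return 0;
}

/* Pick the least-loaded CPU in @affinity by allocating a slot on it */
static int sketch_matrix_pick_cpu(const struct cpumask *affinity, bool managed)
{
        unsigned int cpu;
        int ret;

        if (managed) {
                /* Managed masks have to be reserved up front */
                ret = irq_matrix_reserve_managed(lpi_matrix, affinity);
                if (ret)
                        return ret;
                ret = irq_matrix_alloc_managed(lpi_matrix, affinity, &cpu);
        } else {
                ret = irq_matrix_alloc(lpi_matrix, affinity, false, &cpu);
        }

        return ret < 0 ? ret : cpu;
}
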
Cheers,
John