linux-kernel - Re: [PATCH v3 2/2] irqchip/gic-v3-its: Balance initial LPI affinity across CPUs

Message-ID: <7b97c24ceced7560b5acb03edaf2cd70@kernel.org>
Date:   Wed, 18 Mar 2020 14:16:37 +0000
From:   Marc Zyngier <maz@...nel.org>
To:     John Garry <john.garry@...wei.com>
Cc:     linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        chenxiang <chenxiang66@...ilicon.com>,
        Zhou Wang <wangzhou1@...ilicon.com>,
        Ming Lei <ming.lei@...hat.com>,
        Jason Cooper <jason@...edaemon.net>,
        Thomas Gleixner <tglx@...utronix.de>, luojiaxing@...wei.com
Subject: Re: [PATCH v3 2/2] irqchip/gic-v3-its: Balance initial LPI affinity
 across CPUs

On 2020-03-17 18:43, John Garry wrote:
>>> 
>>>> +        int this_count = its_read_lpi_count(d, tmp);
>>> 
>>> Not sure if it's intentional, but now there seems to be a subtle
>>> difference to what Thomas described for non-managed interrupts - for
>>> non-managed interrupts, x86 selects the CPU based on the total
>>> interrupt load per CPU (or, more specifically, lowest vector
>>> allocation count), and not just the non-managed load. Or maybe I
>>> misread it.
>> 
>> So far, I'm trying to keep the two allocation paths separate, as the
>> two systems I have access to have very different behaviours: D05 has
>> no managed interrupts to speak of, and my top-secret work machine
>> has almost no unmanaged interrupts, so the two sets are almost
>> completely disjoint.
> 
> Sure, but I'd say that it would be a more common scenario to have a
> mixture of both.
> 
>> 
>> Also, it all depends on the interrupt allocation order, and whether
>> something will rebalance the non-managed interrupts at a later time.
>> At least, these two patches make it easy to alter the placement policy
>> (the behaviour you describe above is a 2 line change).
>> 
>>> Anyway, we can test this now for NVMe with its managed interrupts.
>> 
>> Looking forward to hearing from you!
>> 
> 
> On my D06CS board (128 core), there seems to be something wrong, as
> the q0 affinity mask looks incorrect:
> 
> PCI name is 81:00.0: nvme0n1
>         irq 322, cpu list 69, effective list 69
>         irq 325, cpu list 32-38, effective list 32
>         irq 326, cpu list 39-45, effective list 40
>         irq 327, cpu list 46-51, effective list 47
>         irq 328, cpu list 52-57, effective list 53
>         irq 329, cpu list 58-63, effective list 59


Sorry, can you explain in more detail what you find wrong in this log?
Is it that interrupt 322 has a single CPU affinity instead of a list?
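
To make the question concrete, here is a minimal Python sketch that checks
each line of the quoted log for two things: whether the affinity mask holds a
single CPU, and whether the effective CPU falls outside the mask. The helper
names (parse_cpu_list, check_irq) are made up for illustration and are not
part of any kernel tooling; the data is copied from the log above.

```python
def parse_cpu_list(text):
    """Expand a kernel-style CPU list such as '32-38,40' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def check_irq(affinity, effective):
    """Return (mask size, sorted effective CPUs that are not in the mask)."""
    mask = parse_cpu_list(affinity)
    stray = sorted(parse_cpu_list(effective) - mask)
    return len(mask), stray


# (irq, cpu list, effective list) as reported for nvme0n1 in the quoted log.
LOG = [
    (322, "69", "69"),
    (325, "32-38", "32"),
    (326, "39-45", "40"),
    (327, "46-51", "47"),
    (328, "52-57", "53"),
    (329, "58-63", "59"),
]

for irq, affinity, effective in LOG:
    size, stray = check_irq(affinity, effective)
    note = "single-CPU mask" if size == 1 else f"{size} CPUs in mask"
    if stray:
        note += f", effective CPUs outside mask: {stray}"
    print(f"irq {irq}: {note}")
```

With this data, only irq 322 is reported as having a single-CPU mask, and no
effective CPU falls outside its mask.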

> And something stranger for my colleague Luo Jiaxing, specifically the
> effective affinity:
> 
> PCI name is 85:00.0: nvme2n1
> irq 196, cpu list 0-31, effective list 82

Right, this one we have seen in your other email. Being a non-managed
interrupt, it lands on the closest socket.
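
For reference, the counting question raised at the top of this thread
(selecting on the per-class LPI count versus the total per-CPU count) can be
modelled with a toy Python sketch. This is not the kernel code: CpuLoad,
pick_cpu and the use_total switch are hypothetical names, and the numbers are
invented purely to show where the two policies diverge.

```python
from dataclasses import dataclass


@dataclass
class CpuLoad:
    """Per-CPU interrupt counts, split by class (toy model)."""
    managed: int = 0
    unmanaged: int = 0


def pick_cpu(loads, candidates, managed, use_total=False):
    """Pick the least-loaded CPU among candidates.

    By default only the count of the interrupt's own class is compared.
    With use_total=True the sum of both classes is compared instead.
    Ties are broken by the lowest CPU number.
    """
    def key(cpu):
        load = loads[cpu]
        if use_total:
            count = load.managed + load.unmanaged
        else:
            count = load.managed if managed else load.unmanaged
        return (count, cpu)

    return min(candidates, key=key)


# Invented example: CPU 0 carries only managed interrupts, CPU 1 carries one
# unmanaged interrupt.
loads = {0: CpuLoad(managed=4, unmanaged=0), 1: CpuLoad(managed=0, unmanaged=1)}

per_class = pick_cpu(loads, [0, 1], managed=False)
total = pick_cpu(loads, [0, 1], managed=False, use_total=True)
print("per-class policy picks CPU", per_class)
print("total-count policy picks CPU", total)
```

In this invented case the per-class policy places a new unmanaged interrupt
on CPU 0, while the total-count policy places it on CPU 1.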

> irq 377, cpu list 32-38, effective list 32
> irq 378, cpu list 39-45, effective list 39
> irq 379, cpu list 46-51, effective list 46
> 
> But then v5.6-rc5 vanilla also looks to have this issue when I tested
> on my board:
> 
> john@...ntu:~$ more /proc/irq/322/smp_affinity_list
> 69
> 
> My D06ES (96 core) board looks sensible for the affinity in this
> regard (I did not try vanilla v5.6-rc5, but only with your patches on
> top). I'll need to debug this.

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...