Date:   Fri, 20 Dec 2019 16:16:31 +0000
From:   Marc Zyngier <maz@...nel.org>
To:     John Garry <john.garry@...wei.com>
Cc:     Ming Lei <ming.lei@...hat.com>, <tglx@...utronix.de>,
        "chenxiang (M)" <chenxiang66@...ilicon.com>,
        <bigeasy@...utronix.de>, <linux-kernel@...r.kernel.org>,
        <hare@...e.com>, <hch@....de>, <axboe@...nel.dk>,
        <bvanassche@....org>, <peterz@...radead.org>, <mingo@...hat.com>
Subject: Re: [PATCH RFC 1/1] genirq: Make threaded handler use irq affinity for managed interrupt

On 2019-12-20 15:38, John Garry wrote:

> I've already done something experimental for the driver to manage the
> affinity, and performance is generally much better:
>
> 
> https://github.com/hisilicon/kernel-dev/commit/e15bd404ed1086fed44da34ed3bd37a8433688a7
>
> But I still think it's wise to only consider managed interrupts for 
> now.

Sure. We've lived with it so far, we can make it last a bit longer... 
;-)
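
For reference, a minimal sketch of what "the driver manages the affinity"
could look like, kept deliberately generic (this is not the code in the
linked commit; my_dev, nr_queues and queue_irq are made-up names): spread
each completion-queue IRQ over the online CPUs with irq_set_affinity_hint().

#include <linux/interrupt.h>
#include <linux/cpumask.h>

/* Hypothetical per-device bookkeeping, only for this sketch. */
struct my_dev {
	int nr_queues;
	unsigned int *queue_irq;	/* one Linux IRQ number per completion queue */
};

static int my_spread_queue_irqs(struct my_dev *dev)
{
	unsigned int cpu = cpumask_first(cpu_online_mask);
	int i, ret;

	for (i = 0; i < dev->nr_queues; i++) {
		/* A hint, not a hard binding: user space can still move the IRQ. */
		ret = irq_set_affinity_hint(dev->queue_irq[i], cpumask_of(cpu));
		if (ret)
			return ret;

		/* Round-robin over the online CPUs. */
		cpu = cpumask_next(cpu, cpu_online_mask);
		if (cpu >= nr_cpu_ids)
			cpu = cpumask_first(cpu_online_mask);
	}
	return 0;
}

Unlike managed interrupts, nothing here follows CPU hotplug or keeps the
affinity in sync with the blk-mq queue mapping, which is exactly the
bookkeeping the driver would have to take on itself.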

>>
>>> JFYI, about the NVMe CPU lockup issue, there are two pieces of work ongoing here:
>>>
>>> 
>>> https://lore.kernel.org/linux-nvme/20191209175622.1964-1-kbusch@kernel.org/T/#t
>>>
>>>
>>> 
>>> https://lore.kernel.org/linux-block/20191218071942.22336-1-ming.lei@redhat.com/T/#t
>>>
>> I've also managed to trigger some of them now that I have access to
>> a decent box with nvme storage.
>
> I only have 2x NVMe SSDs when this occurs - I should not be hitting 
> this...

Same configuration here. And the number of interrupts is pretty
low (less than 20k/s per CPU), so I doubt this is interrupt-related.

>> Out of curiosity, have you tried
>> with the SMMU disabled? I'm wondering whether we hit some livelock
>> condition on unmapping buffers...
>
> No, but I can give it a try. Doing that should lower the CPU usage,
> though, so it may just mask the issue - probably not.

I wonder whether we could end up in some form of unmap storm on
completion, with a CPU being starved while trying to insert its TLBI
command into the queue.
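
To make that suspicion concrete, here is a rough user-space model of the
scenario - purely illustrative, not the arm-smmu-v3 command queue code:
every unmap on the completion path has to insert a TLBI command into one
bounded queue shared by all CPUs, and a CPU that keeps finding the queue
full (or losing the race for a slot) simply spins.

#include <stdatomic.h>
#include <stdbool.h>

#define CMDQ_DEPTH 128

struct cmdq {
	atomic_uint prod;			/* next free slot (monotonic) */
	atomic_uint cons;			/* next slot the "hardware" consumes (monotonic) */
	unsigned long long cmd[CMDQ_DEPTH];
};

static bool cmdq_try_insert(struct cmdq *q, unsigned long long tlbi_cmd)
{
	unsigned int prod = atomic_load(&q->prod);
	unsigned int cons = atomic_load(&q->cons);

	if (prod - cons >= CMDQ_DEPTH)
		return false;			/* queue full, caller must retry */

	/* Claim the slot; another CPU may win the race and push us back. */
	if (!atomic_compare_exchange_strong(&q->prod, &prod, prod + 1))
		return false;

	q->cmd[prod % CMDQ_DEPTH] = tlbi_cmd;	/* ordering details ignored in this toy */
	return true;
}

/* Called for every buffer unmapped on I/O completion: if completions on
 * the other CPUs keep the queue full, this loop is where one CPU starves. */
static void issue_tlbi(struct cmdq *q, unsigned long long tlbi_cmd)
{
	while (!cmdq_try_insert(q, tlbi_cmd))
		;				/* busy-wait: the suspected starvation point */
}

If completions (and hence unmaps) fan in from many CPUs at once, a slow or
unlucky CPU can sit in that retry loop long enough to look like a lockup,
which is the "unmap storm" idea in a nutshell.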

Anyway, more digging ahead.

         M.
-- 
Jazz is not dead. It just smells funny...
