Message-ID: <c297d618-1378-f51b-45db-605a3fc15336@huawei.com>
Date:   Fri, 23 Apr 2021 13:02:51 +0100
From:   John Garry <john.garry@...wei.com>
To:     Thomas Gleixner <tglx@...utronix.de>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Marc Zyngier" <maz@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        "Ingo Molnar" <mingo@...nel.org>
Subject: Re: Question on threaded handlers for managed interrupts

On 23/04/2021 11:50, Thomas Gleixner wrote:

Hi Thomas,

>> The multi-queue storage controller (see [1] for memory refresh, but
>> note that I can also trigger on PCI device host controller as well) is
>> using managed interrupts and threaded handlers. Since the threaded
>> handler uses SCHED_FIFO, aren't we always vulnerable to this situation
>> with the managed interrupt and threaded handler combo? Would the
>> advice be to just use irq polling here?
> This is a really good question. Most interrupt handlers are not running
> exceedingly long or come in with high frequency, but of course this
> problem exists.
> 
> The network people have solved it with NAPI which disables the interrupt
> in the device and polls it from softirq context (which might be then
> delegated to ksoftirqd) until it's drained.
> 
> I'm not familiar with the block/multiqueue layer to be able to tell
> whether such a concept exists there as well.
> 

Other MQ SCSI drivers have had similar problems. They were handling all
completion interrupts in hard irq context; under high throughput the
handlers would not exit, and the drivers were then getting lockups.

Their solution was to switch over to using irq_poll when per-queue
completions got above a certain rate.
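
For illustration, that scheme can be sketched as a small userspace model.
This is not the kernel irq_poll API and not taken from any of those
drivers: every name here (struct queue, hard_irq(), poll_once(),
POLL_THRESHOLD, POLL_BUDGET) is hypothetical, and the backlog seen at
interrupt time is assumed to stand in for the completion rate.

```c
#include <stdbool.h>

/* Hypothetical tunables, chosen only for this sketch. */
#define POLL_THRESHOLD 8  /* backlog above which we hand off to polling */
#define POLL_BUDGET    4  /* max completions handled per poll pass */

/* Userspace model of one completion queue (not a kernel structure). */
struct queue {
	unsigned int pending;    /* completions waiting to be reaped */
	unsigned int completed;  /* completions processed so far */
	bool irq_enabled;        /* models the per-queue interrupt mask */
	bool poll_scheduled;     /* models a pending poll pass */
};

/* Reap at most 'budget' completions; return how many were handled. */
static unsigned int reap(struct queue *q, unsigned int budget)
{
	unsigned int done = 0;

	while (q->pending > 0 && done < budget) {
		q->pending--;
		q->completed++;
		done++;
	}
	return done;
}

/* Models the hard irq handler: light load is handled inline, heavy
 * load masks the interrupt and defers the work to budgeted polling. */
static void hard_irq(struct queue *q)
{
	if (q->pending > POLL_THRESHOLD) {
		q->irq_enabled = false;
		q->poll_scheduled = true;
		return;
	}
	reap(q, q->pending);
}

/* Models one poll pass. Returns true if another pass is needed; once a
 * pass uses less than its budget the queue is drained, so the interrupt
 * is re-enabled and polling stops. */
static bool poll_once(struct queue *q)
{
	unsigned int done = reap(q, POLL_BUDGET);

	if (done < POLL_BUDGET) {
		q->poll_scheduled = false;
		q->irq_enabled = true;
		return false;
	}
	return true;
}
```

The point of the budget is that no single pass can run unbounded, so
other work gets a chance to run between passes even under sustained
load.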

> OTOH, the way how you splitted the handling into hard/thread context
> provides already the base for this.
>

Right, so I could switch to a scheme similar to the above, but I would
have thought that what I have with a threaded handler already suffices.

> The missing piece is infrastructure at the irq/scheduler core level to
> handle this transparently.
> 
> I have some horrible ideas how to solve that, but I'm sure the scheduler
> wizards can come up with a reasonable and generic solution.

That would be great.

Thanks,
John
