Message-ID: <b8c4be8c-1d67-c16c-570e-d3c883c77ea2@huawei.com>
Date: Thu, 22 Apr 2021 17:10:49 +0100
From: John Garry <john.garry@...wei.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Question on threaded handlers for managed interrupts
Hi Thomas,
I am finding that I can pretty easily trigger a system hang in certain
scenarios with my storage controller.
I'm getting something like the following under moderately heavy data
throughput:
Starting 6 processes
[70.656622] sched: RT throttling activatedB/s][r=356k,w=0 IOPS][eta 01h:14m:43s]
[ 207.632161] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:ta 01h:12m:26s]
[ 207.638261] rcu: 0-...!: (1 GPs behind) idle=312/1/0x4000000000000000 softirq=508/512 fqs=0
[ 207.646777] rcu: 1-...!: (1 GPs behind) idle=694/0/0x0
It ends pretty badly - see [0].
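
For reference, the "RT throttling activated" message appears once
realtime tasks have used up sched_rt_runtime_us within one
sched_rt_period_us window. A minimal userspace sketch of that budget
follows. The helper names rt_share() and read_long() are hypothetical
and only for illustration; the two procfs files are the standard
sysctls, whose usual defaults are 950000 and 1000000.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Fraction of each period that RT tasks may run before being throttled.
 * A runtime of -1 means throttling is disabled, reported here as 1.0.
 * (Hypothetical helper, not a kernel interface.)
 */
double rt_share(long runtime_us, long period_us)
{
    assert(period_us > 0);
    if (runtime_us < 0 || runtime_us > period_us)
        return 1.0;
    return (double)runtime_us / (double)period_us;
}

/* Read one integer from a procfs file; returns 0 on success, -1 on error. */
int read_long(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Print the RT budget currently configured on this machine, if readable. */
void print_rt_budget(void)
{
    long runtime, period;

    if (read_long("/proc/sys/kernel/sched_rt_runtime_us", &runtime) ||
        read_long("/proc/sys/kernel/sched_rt_period_us", &period) ||
        period <= 0) {
        printf("RT sysctls not readable on this system\n");
        return;
    }
    printf("RT tasks may use %.1f%% of each %ld us period\n",
           100.0 * rt_share(runtime, period), period);
}
```

With the usual defaults this works out to roughly 95% of each second,
so seeing the message at all suggests the SCHED_FIFO work was runnable
almost continuously.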
The multi-queue storage controller (see [1] for a refresher, but note
that I can also trigger this on a PCI device host controller) is using
managed interrupts and threaded handlers. Since the threaded handler
runs as SCHED_FIFO, aren't we always vulnerable to this situation with
the managed interrupt and threaded handler combo? Would the advice be
to just use irq polling here?
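
To make the polling alternative concrete, here is a minimal userspace
sketch of the budget idea behind polled completion handling: each pass
handles at most "budget" completions and then returns, so a flood of
completions is consumed in bounded chunks rather than in one unbounded
run. This only models the concept; struct cq_sim, cq_poll() and
cq_drain() are hypothetical names, not the kernel irq_poll API or the
driver's code.

```c
#include <assert.h>

/* Simulated completion queue: entries still pending and entries handled. */
struct cq_sim {
    unsigned int pending;
    unsigned int handled;
};

/*
 * Handle at most 'budget' completions and return how many were handled.
 * Returning fewer than 'budget' means the queue was emptied in this pass.
 */
int cq_poll(struct cq_sim *cq, int budget)
{
    int done = 0;

    assert(budget > 0);
    while (done < budget && cq->pending > 0) {
        cq->pending--;
        cq->handled++;
        done++;
    }
    return done;
}

/*
 * Poll repeatedly until a pass uses less than its full budget.
 * Returns the number of passes; in a real system each pass boundary is
 * a point where other work could be allowed to run.
 */
int cq_drain(struct cq_sim *cq, int budget)
{
    int passes = 0;
    int done;

    do {
        done = cq_poll(cq, budget);
        passes++;
    } while (done == budget);

    return passes;
}
```

For example, 10 pending completions with a budget of 4 take three
passes (4, 4 and 2 entries).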
I tried to trigger the same on NVMe PCI without success - however I have
only 1x card, so I am hardly overloading the system.
Thanks,
John
[0]
https://lore.kernel.org/rcu/412926e8-d3e1-3071-8cb9-098a7f49b64c@huawei.com/T/#mbd60463c543e04f87090d89301e1a5f10de958dd
[1]
https://lore.kernel.org/linux-scsi/1606905417-183214-1-git-send-email-john.garry@huawei.com/#t