Message-Id: <20250110-isolcpus-io-queues-v5-8-0e4f118680b0@kernel.org>
Date: Fri, 10 Jan 2025 17:26:46 +0100
From: Daniel Wagner <wagi@...nel.org>
To: Jens Axboe <axboe@...nel.dk>, Keith Busch <kbusch@...nel.org>,
Christoph Hellwig <hch@....de>, Sagi Grimberg <sagi@...mberg.me>,
"Michael S. Tsirkin" <mst@...hat.com>
Cc: "Martin K. Petersen" <martin.petersen@...cle.com>,
Thomas Gleixner <tglx@...utronix.de>,
Costa Shulyupin <costa.shul@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
Valentin Schneider <vschneid@...hat.com>, Waiman Long <llong@...hat.com>,
Ming Lei <ming.lei@...hat.com>, Frederic Weisbecker <frederic@...nel.org>,
Mel Gorman <mgorman@...e.de>, Hannes Reinecke <hare@...e.de>,
linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
linux-nvme@...ts.infradead.org, megaraidlinux.pdl@...adcom.com,
linux-scsi@...r.kernel.org, storagedev@...rochip.com,
virtualization@...ts.linux.dev, GR-QLogic-Storage-Upstream@...vell.com,
Daniel Wagner <wagi@...nel.org>
Subject: [PATCH v5 8/9] blk-mq: issue warning when offlining hctx with
online isolcpus
When isolcpus=managed_irq is enabled and the last housekeeping CPU for
a given hardware context goes offline, no CPU is left to handle IO for
that context. If isolated CPUs mapped to this hardware context are
still online and an application running on them issues an IO, the IO
will stall.
The kernel will not schedule IO to isolated CPUs, thus this avoids IO
stalls.
Thus issue a warning when housekeeping CPUs are offlined for a hardware
context while there are still isolated CPUs online.
Signed-off-by: Daniel Wagner <wagi@...nel.org>
---
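For readers who want to reason about the condition outside the kernel,
here is a minimal userspace sketch of the situation described above. It
is not the code in this patch: CPU sets are modelled as bits in a 64-bit
word, and the names model_cpu_isset() and model_count_stranded_cpus()
are hypothetical. Unlike the patch, which warns while iterating, the
sketch scans the whole mapping first and only then reports how many
online isolated CPUs would be left without a housekeeping CPU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model: bit N set means CPU N is in the set. */
typedef uint64_t cpuset_model_t;

static inline bool model_cpu_isset(cpuset_model_t mask, unsigned int cpu)
{
	return cpu < 64 && (mask & (UINT64_C(1) << cpu));
}

/*
 * Count the online isolated CPUs mapped to one hardware context that
 * would have no online housekeeping CPU left once 'offlining' is gone.
 * Returns 0 if an online housekeeping CPU remains (no stall expected)
 * or if no other CPU of the context is online at all.
 */
int model_count_stranded_cpus(const unsigned int *hctx_cpus, size_t nr,
			      cpuset_model_t online,
			      cpuset_model_t housekeeping,
			      unsigned int offlining)
{
	int stranded = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		unsigned int cpu = hctx_cpus[i];

		/* The CPU going away cannot serve the context anymore. */
		if (cpu == offlining)
			continue;

		/* Offline CPUs neither serve IO nor issue it. */
		if (!model_cpu_isset(online, cpu))
			continue;

		/* An online housekeeping CPU keeps the context usable. */
		if (model_cpu_isset(housekeeping, cpu))
			return 0;

		/* Online, but isolated: IO issued here could stall. */
		stranded++;
	}

	return stranded;
}
```

As an example, take a context mapped to CPUs 0-3 with CPUs 0 and 1 as
housekeeping CPUs. Offlining CPU 0 while CPU 1 is online yields 0.
Offlining CPU 1 afterwards, with CPUs 2 and 3 still online, yields 2,
which corresponds to the case this patch warns about.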
block/blk-mq.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 42 insertions(+), 1 deletion(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 2e6132f778fd958aae3cad545e4b3dd623c9c304..43eab0db776d37ffd7eb6c084211b5e05d41a574 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3620,6 +3620,45 @@ static bool blk_mq_hctx_has_requests(struct blk_mq_hw_ctx *hctx)
return data.has_rq;
}
+static void blk_mq_hctx_check_isolcpus_online(struct blk_mq_hw_ctx *hctx, unsigned int cpu)
+{
+ const struct cpumask *hk_mask;
+ int i;
+
+ if (!housekeeping_enabled(HK_TYPE_MANAGED_IRQ))
+ return;
+
+ hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
+
+ for (i = 0; i < hctx->nr_ctx; i++) {
+ struct blk_mq_ctx *ctx = hctx->ctxs[i];
+
+ if (ctx->cpu == cpu)
+ continue;
+
+ /*
+ * Check if this context has at least one online
+ * housekeeping CPU; in this case the hardware context is
+ * usable.
+ */
+ if (cpumask_test_cpu(ctx->cpu, hk_mask) &&
+ cpu_online(ctx->cpu))
+ break;
+
+ /*
+ * The context doesn't have any online housekeeping CPUs
+ * but there might be an online isolated CPU mapped to
+ * it.
+ */
+ if (cpu_is_offline(ctx->cpu))
+ continue;
+
+ pr_warn("%s: offlining hctx%d but there is still an online isolcpu CPU %d mapped to it, IO stalls expected\n",
+ hctx->queue->disk->disk_name,
+ hctx->queue_num, ctx->cpu);
+ }
+}
+
static bool blk_mq_hctx_has_online_cpu(struct blk_mq_hw_ctx *hctx,
unsigned int this_cpu)
{
@@ -3639,8 +3678,10 @@ static bool blk_mq_hctx_has_online_cpu(struct blk_mq_hw_ctx *hctx,
continue;
/* this hctx has at least one online CPU */
- if (this_cpu != cpu)
+ if (this_cpu != cpu) {
+ blk_mq_hctx_check_isolcpus_online(hctx, this_cpu);
return true;
+ }
}
return false;
--
2.47.1