Message-ID: <ff66801c-f261-411d-bbbf-b386e013d096@suse.de>
Date: Mon, 8 Sep 2025 08:13:31 +0200
From: Hannes Reinecke <hare@...e.de>
To: Daniel Wagner <wagi@...nel.org>, Jens Axboe <axboe@...nel.dk>,
Keith Busch <kbusch@...nel.org>, Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>, "Michael S. Tsirkin" <mst@...hat.com>
Cc: Aaron Tomlin <atomlin@...mlin.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Thomas Gleixner <tglx@...utronix.de>, Costa Shulyupin
<costa.shul@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
Valentin Schneider <vschneid@...hat.com>, Waiman Long <llong@...hat.com>,
Ming Lei <ming.lei@...hat.com>, Frederic Weisbecker <frederic@...nel.org>,
Mel Gorman <mgorman@...e.de>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
linux-nvme@...ts.infradead.org, megaraidlinux.pdl@...adcom.com,
linux-scsi@...r.kernel.org, storagedev@...rochip.com,
virtualization@...ts.linux.dev, GR-QLogic-Storage-Upstream@...vell.com
Subject: Re: [PATCH v8 10/12] blk-mq: use hk cpus only when isolcpus=io_queue
is enabled
On 9/5/25 16:59, Daniel Wagner wrote:
> Extend the capabilities of the generic CPU to hardware queue (hctx)
> mapping code, so it maps housekeeping CPUs and isolated CPUs to the
> hardware queues evenly.
>
> A hctx is only operational when there is at least one online
> housekeeping CPU assigned (aka active_hctx). Thus, check the final
> mapping to ensure that no hctx has only offline housekeeping CPUs and
> online isolated CPUs.
>
> Example mapping result:
>
> 16 online CPUs
>
> isolcpus=io_queue,2-3,6-7,12-13
>
> Queue mapping:
> hctx0: default 0 2
> hctx1: default 1 3
> hctx2: default 4 6
> hctx3: default 5 7
> hctx4: default 8 12
> hctx5: default 9 13
> hctx6: default 10
> hctx7: default 11
> hctx8: default 14
> hctx9: default 15
>
> IRQ mapping:
> irq 42 affinity 0 effective 0 nvme0q0
> irq 43 affinity 0 effective 0 nvme0q1
> irq 44 affinity 1 effective 1 nvme0q2
> irq 45 affinity 4 effective 4 nvme0q3
> irq 46 affinity 5 effective 5 nvme0q4
> irq 47 affinity 8 effective 8 nvme0q5
> irq 48 affinity 9 effective 9 nvme0q6
> irq 49 affinity 10 effective 10 nvme0q7
> irq 50 affinity 11 effective 11 nvme0q8
> irq 51 affinity 14 effective 14 nvme0q9
> irq 52 affinity 15 effective 15 nvme0q10
>
> A corner case is when the number of online CPUs and present CPUs
> differ and the driver asks for fewer queues than online CPUs, e.g.
>
> 8 online CPUs, 16 possible CPUs
>
> isolcpus=io_queue,2-3,6-7,12-13
> virtio_blk.num_request_queues=2
>
> Queue mapping:
> hctx0: default 0 1 2 3 4 5 6 7 8 12 13
> hctx1: default 9 10 11 14 15
>
> IRQ mapping
> irq 27 affinity 0 effective 0 virtio0-config
> irq 28 affinity 0-1,4-5,8 effective 5 virtio0-req.0
> irq 29 affinity 9-11,14-15 effective 0 virtio0-req.1
>
> Noteworthy is that for the normal/default configuration (!isolcpus) the
> mapping will change for systems which have non-hyperthreading CPUs. The
> main assignment loop relies completely on group_mask_cpus_evenly to
> do the right thing. The old code would distribute the CPUs linearly over
> the hardware contexts:
>
> queue mapping for /dev/nvme0n1
> hctx0: default 0 8
> hctx1: default 1 9
> hctx2: default 2 10
> hctx3: default 3 11
> hctx4: default 4 12
> hctx5: default 5 13
> hctx6: default 6 14
> hctx7: default 7 15
>
> The new code assigns each hardware context the map generated by the
> group_mask_cpus_evenly function:
>
> queue mapping for /dev/nvme0n1
> hctx0: default 0 1
> hctx1: default 2 3
> hctx2: default 4 5
> hctx3: default 6 7
> hctx4: default 8 9
> hctx5: default 10 11
> hctx6: default 12 13
> hctx7: default 14 15
>
> In case of hyperthreading CPUs, the resulting map stays the same.
>
> Signed-off-by: Daniel Wagner <wagi@...nel.org>
> ---
> block/blk-mq-cpumap.c | 177 ++++++++++++++++++++++++++++++++++++++++++++------
> 1 file changed, 158 insertions(+), 19 deletions(-)
>
> diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
> index 8244ecf878358c0b8de84458dcd5100c2f360213..1e66882e4d5bd9f78d132f3a229a1577853f7a9f 100644
> --- a/block/blk-mq-cpumap.c
> +++ b/block/blk-mq-cpumap.c
> @@ -17,12 +17,25 @@
> #include "blk.h"
> #include "blk-mq.h"
>
> +static struct cpumask blk_hk_online_mask;
> +
> static unsigned int blk_mq_num_queues(const struct cpumask *mask,
> unsigned int max_queues)
> {
> unsigned int num;
>
> - num = cpumask_weight(mask);
> + if (housekeeping_enabled(HK_TYPE_IO_QUEUE)) {
> + const struct cpumask *hk_mask;
> + struct cpumask avail_mask;
> +
> + hk_mask = housekeeping_cpumask(HK_TYPE_IO_QUEUE);
> + cpumask_and(&avail_mask, mask, hk_mask);
> +
> + num = cpumask_weight(&avail_mask);
> + } else {
> + num = cpumask_weight(mask);
> + }
> +
> return min_not_zero(num, max_queues);
> }
>
> @@ -31,9 +44,13 @@ static unsigned int blk_mq_num_queues(const struct cpumask *mask,
> *
> * Returns an affinity mask that represents the queue-to-CPU mapping
> * requested by the block layer based on possible CPUs.
> + * This helper takes isolcpus settings into account.
> */
> const struct cpumask *blk_mq_possible_queue_affinity(void)
> {
> + if (housekeeping_enabled(HK_TYPE_IO_QUEUE))
> + return housekeeping_cpumask(HK_TYPE_IO_QUEUE);
> +
> return cpu_possible_mask;
> }
> EXPORT_SYMBOL_GPL(blk_mq_possible_queue_affinity);
> @@ -46,6 +63,12 @@ EXPORT_SYMBOL_GPL(blk_mq_possible_queue_affinity);
> */
> const struct cpumask *blk_mq_online_queue_affinity(void)
> {
> + if (housekeeping_enabled(HK_TYPE_IO_QUEUE)) {
> + cpumask_and(&blk_hk_online_mask, cpu_online_mask,
> + housekeeping_cpumask(HK_TYPE_IO_QUEUE));
> + return &blk_hk_online_mask;

Can you explain the use of 'blk_hk_online_mask'?
Why is it a static variable?
To my untrained eye it's being recalculated every time this function
is called, and only the first invocation runs on an empty mask;
all subsequent ones see a populated mask.
Is that the intention?
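
For reference, a minimal userspace sketch of the pattern in question, not the kernel code itself. It assumes a 64-bit integer as a stand-in for struct cpumask; mask_t, mask_and(), online_queue_affinity() and isolate_cpus() are hypothetical names made up for illustration. It models only one point: cpumask_and() overwrites its destination, so whatever the static mask held from an earlier call does not influence the result, and the static storage mainly keeps the returned pointer valid.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit N set means CPU N. */
typedef uint64_t mask_t;

/* Stand-ins for cpu_online_mask and the HK_TYPE_IO_QUEUE housekeeping mask. */
mask_t online_mask = 0xffff;	/* CPUs 0-15 online */
mask_t hk_mask = 0xffff;	/* no CPU isolated yet */

/* File-scope storage, so the returned pointer outlives the call. */
static mask_t hk_online_mask;

/* Models cpumask_and(): *dst is overwritten with a & b, not merged. */
void mask_and(mask_t *dst, mask_t a, mask_t b)
{
	*dst = a & b;
}

/* Same shape as the quoted blk_mq_online_queue_affinity() hunk. */
const mask_t *online_queue_affinity(void)
{
	mask_and(&hk_online_mask, online_mask, hk_mask);
	return &hk_online_mask;
}

/* Mark CPUs first..last (inclusive) as isolated, i.e. not housekeeping. */
void isolate_cpus(unsigned int first, unsigned int last)
{
	for (unsigned int cpu = first; cpu <= last; cpu++)
		hk_mask &= ~((mask_t)1 << cpu);
}
```

In this model, isolating CPUs 2-3, 6-7 and 12-13 and then taking CPU 0 offline between two calls changes the second result accordingly, regardless of what the first call left in the static mask. The sketch does not model concurrent callers sharing that one buffer.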
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@...e.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich