Message-ID: <CAB=BE-RHwqmSRt-RbmuJ4j1bOFqv1DrYD9m-E1H99hYRnTiXLw@mail.gmail.com>
Date: Mon, 12 Aug 2024 11:15:52 -0700
From: Sandeep Dhavale <dhavale@...gle.com>
To: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: MANISH PANDEY <quic_mapa@...cinc.com>, Bart Van Assche <bvanassche@....org>,
Qais Yousef <qyousef@...alina.io>, Christian Loehle <christian.loehle@....com>, axboe@...nel.dk,
mingo@...nel.org, peterz@...radead.org, vincent.guittot@...aro.org,
linux-block@...r.kernel.org, sudeep.holla@....com,
Jaegeuk Kim <jaegeuk@...nel.org>, Christoph Hellwig <hch@...radead.org>, kailash@...gle.com,
tkjos@...gle.com, bvanassche@...gle.com, quic_nitirawa@...cinc.com,
quic_cang@...cinc.com, quic_rampraka@...cinc.com, quic_narepall@...cinc.com,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Regarding patch "block/blk-mq: Don't complete locally if
capacities are different"
Hi Dietmar,
[..]
>
> So the issue for you with commit af550e4c9682 seems to be that those
> completions don't happen on big CPUs (cpu_capacity = 1024) anymore,
> since the condition in blk_mq_complete_need_ipi() (1):
>
> if (!QUEUE_FLAG_SAME_FORCE && cpus_share_cache(cpu, rq->mq_ctx->cpu) &&
> cpus_equal_capacity(cpu, rq->mq_ctx->cpu))
>
> is no longer true if 'rq->mq_ctx->cpu != big CPU' so (1) returns true
> and blk_mq_complete_request_remote() sends an ipi to 'rq->mq_ctx->cpu'.
>
>
> I tried to simulate this with a 6 CPUs aarch64 QEMU tri-gear (3
> different cpu_capacity values) system:
>
> cat /sys/devices/system/cpu/online
> 0-5
>
> # cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 446
> 446
> 871
> 871
> 1024
> 1024
>
> # grep -i virtio /proc/interrupts | while read a b; do grep -aH .
> /proc/irq/${a%:}/smp_affinity; done
> /proc/irq/15/smp_affinity:3f /* block device */
> /proc/irq/16/smp_affinity:3f /* network device */
>
> So you set the block device irq affine to the big CPUs (0x30).
>
> # echo 30 > /proc/irq/15/smp_affinity
>
> And with the patch, you send ipi's in blk_mq_complete_request_remote()
> in case 'rq->mq_ctx->cpu=[0-4]' whereas w/o the patch or the change to:
>
> arch_scale_cpu_capacity(cpu) >=
> arch_scale_cpu_capacity(rq->mq_ctx->cpu) (2)
>
> you would complete the request locally (i.e. on CPU4/5):
>
> gic_handle_irq() -> ... -> handle_irq_event() -> ... -> vm_interrupt()
> -> ... -> virtblk_done() (callback) -> blk_mq_complete_request() ->
> blk_mq_complete_request_remote(), rq->q->mq_ops->complete(rq)
>
> The patch IMHO was introduced to avoid running local when 'local =
> little CPU'. Since you use system knowledge and set IRQ affinity
> explicitly to big CPU's to run local on them, maybe (2) is the way to
> allow both?
Thank you for doing the experiment.
I agree that replacing cpus_equal_capacity() with a greater-than-or-equal
check (this is what Qais had in his v1 patch [1]) will allow both.
[1] https://lore.kernel.org/all/20240122224220.1206234-1-qyousef@layalina.io/
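To make the difference between (1) and (2) concrete, here is a rough
user-space sketch using the cpu_capacity values from your QEMU setup.
It is only a model, not the kernel code: capacity_of(), need_ipi_equal(),
need_ipi_ge() and count_ipis() are hypothetical names, and it assumes
QUEUE_FLAG_SAME_FORCE is not set and that all CPUs share a cache, so only
the capacity comparison decides.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 6

/* cpu_capacity values from the 6-CPU QEMU setup quoted above. */
static const unsigned long cpu_capacity[NR_CPUS] = {
	446, 446, 871, 871, 1024, 1024
};

/* Hypothetical stand-in for arch_scale_cpu_capacity(). */
static unsigned long capacity_of(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_capacity[cpu];
}

/* Check (1): complete locally only if both CPUs have equal capacity. */
static bool need_ipi_equal(int cpu, int submit_cpu)
{
	return capacity_of(cpu) != capacity_of(submit_cpu);
}

/* Check (2): complete locally if the IRQ CPU is at least as capable. */
static bool need_ipi_ge(int cpu, int submit_cpu)
{
	return capacity_of(cpu) < capacity_of(submit_cpu);
}

/* Count submitting CPUs whose completion on irq_cpu would need an IPI. */
static int count_ipis(bool (*need_ipi)(int, int), int irq_cpu)
{
	int submit_cpu, n = 0;

	for (submit_cpu = 0; submit_cpu < NR_CPUS; submit_cpu++)
		if (need_ipi(irq_cpu, submit_cpu))
			n++;
	return n;
}
```

In this model, with the IRQ on CPU4, count_ipis(need_ipi_equal, 4) is 4
(submitters 0-3 get an IPI) while count_ipis(need_ipi_ge, 4) is 0. With
the IRQ on CPU0, both checks still send an IPI for submitters 2-5, so
little CPUs keep handing completions to the more capable CPUs.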
Thanks,
Sandeep