Message-ID: <d2009fca-57db-49e6-a874-e8291c3e27f5@quicinc.com>
Date: Thu, 1 Aug 2024 14:55:31 +0530
From: MANISH PANDEY <quic_mapa@...cinc.com>
To: <qyousef@...alina.io>
CC: <axboe@...nel.dk>, <mingo@...nel.org>, <peterz@...radead.org>,
	<vincent.guittot@...aro.org>, <dietmar.eggemann@....com>,
	<linux-block@...r.kernel.org>, <sudeep.holla@....com>,
	Jaegeuk Kim <jaegeuk@...nel.org>,
	Bart Van Assche <bvanassche@....org>,
	Christoph Hellwig <hch@...radead.org>, <kailash@...gle.com>,
	<tkjos@...gle.com>, <dhavale@...gle.com>, <bvanassche@...gle.com>,
	<quic_nitirawa@...cinc.com>, <quic_cang@...cinc.com>,
	<quic_rampraka@...cinc.com>, <quic_narepall@...cinc.com>,
	<linux-kernel@...r.kernel.org>
Subject: Re: Regarding patch "block/blk-mq: Don't complete locally if
capacities are different"
++ adding linux-kernel group
On 7/31/2024 7:16 PM, MANISH PANDEY wrote:
> Hi Qais Yousef,
> Recently we observed below patch has been merged
> https://lore.kernel.org/all/20240223155749.2958009-3-qyousef@layalina.io
>
> This patch causes a ~20% performance degradation in random IO, along
> with a significant drop in sequential IO performance, so we would like
> to revert it. It impacts MCQ UFS devices heavily, and non-MCQ devices
> are affected as well.
>
> We have several concerns with the patch:
> 1. It takes away the device driver's ability to affine completions to
> the best possible CPUs, and restricts the driver to CPUs in the same
> capacity group.
>
> 2. Why can't the device driver use IRQ affinity to pick the desired
> CPUs for completing IO requests, instead of having this forced by the
> block layer?
>
> 3. CPUs are already grouped based on LLC; why is an additional
> categorization by capacity required?
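
Point 2 above can already be exercised from userspace; a minimal sketch
of steering an interrupt (and hence where completion work starts) via
smp_affinity. The IRQ number and the big-core CPU set below are
assumptions for illustration; real values depend on the SoC and on how
the UFS controller's MCQ completion queues are wired up:

```shell
#!/bin/sh
# Build an affinity mask for the assumed big cores, CPUs 4-7.
mask=$(( (1 << 4) | (1 << 5) | (1 << 6) | (1 << 7) ))
printf '%x\n' "$mask"    # prints f0

# Writing that mask to an IRQ's smp_affinity steers its hardirq to those
# CPUs. 123 is a placeholder IRQ number, so the write is left commented
# out; it also requires root on a real system.
# printf '%x' "$mask" > /proc/irq/123/smp_affinity
```

In-kernel, a driver can express the same preference with
irq_set_affinity_hint() on its completion-queue interrupts.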
>
>> big performance impact if the IO request
>> was done from a CPU with higher capacity but the interrupt is serviced
>> on a lower capacity CPU.
>
> This patch doesn't consider the contention between the submission and
> completion paths. Also, what if we want to complete a request
> submitted on a smaller-capacity CPU on a higher-capacity CPU?
> Shouldn't the device driver take care of this, allowing vendors to use
> the best possible combination for their platform?
> Does the patch consider MCQ devices and their different SQ<->CQ
> mappings?
>
>> Without the patch I see the BLOCK softirq always running on little cores
>> (where the hardirq is serviced). With it I can see it running on all
>> cores.
>
> Why can't we use echo 2 > rq_affinity to force completion on the same
> group of CPUs from which the request was initiated?
> Also, why force vendors to always use SOFTIRQ for completion?
> We should be flexible enough to complete the IO request via IPI,
> HARDIRQ or SOFTIRQ.
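
For reference, rq_affinity is a per-device sysfs knob; a minimal sketch
of the setting mentioned above (the device name "sda" is a placeholder,
and the write requires root, so the sketch guards it):

```shell
#!/bin/sh
# Per Documentation/ABI/stable/sysfs-block, queue/rq_affinity:
#   1 - migrate completions to the submitting CPU's group
#   2 - force completion on the CPU that submitted the request
knob=/sys/block/sda/queue/rq_affinity   # "sda" is a placeholder device
# Guard the write so the sketch is a no-op on machines without the device.
[ -w "$knob" ] && echo 2 > "$knob"
```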
>
>
> An SoC can have many possible CPU configurations, and this patch
> forces a restriction on the completion path. The problem is even worse
> on MCQ devices, since we can have different SQ<->CQ mappings.
>
> So we would like to revert the patch. Please let us know if there are
> any concerns.
>
> Regards
> Manish Pandey