Message-ID: <d2009fca-57db-49e6-a874-e8291c3e27f5@quicinc.com>
Date: Thu, 1 Aug 2024 14:55:31 +0530
From: MANISH PANDEY <quic_mapa@...cinc.com>
To: <qyousef@...alina.io>
CC: <axboe@...nel.dk>, <mingo@...nel.org>, <peterz@...radead.org>,
	<vincent.guittot@...aro.org>, <dietmar.eggemann@....com>,
	<linux-block@...r.kernel.org>, <sudeep.holla@....com>,
	Jaegeuk Kim <jaegeuk@...nel.org>,
	Bart Van Assche <bvanassche@....org>,
	Christoph Hellwig <hch@...radead.org>, <kailash@...gle.com>,
	<tkjos@...gle.com>, <dhavale@...gle.com>, <bvanassche@...gle.com>,
	<quic_nitirawa@...cinc.com>, <quic_cang@...cinc.com>,
	<quic_rampraka@...cinc.com>, <quic_narepall@...cinc.com>,
	<linux-kernel@...r.kernel.org>
Subject: Re: Regarding patch "block/blk-mq: Don't complete locally if
capacities are different"
++ adding linux-kernel group
On 7/31/2024 7:16 PM, MANISH PANDEY wrote:
> Hi Qais Yousef,
> Recently we observed below patch has been merged
> https://lore.kernel.org/all/20240223155749.2958009-3-qyousef@layalina.io
>
> This patch causes a ~20% performance degradation in random IO, along
> with a significant drop in sequential IO performance, so we would like
> to revert it. It impacts MCQ UFS devices heavily, and non-MCQ devices
> are affected as well.
>
> We have several concerns with the patch:
> 1. It takes away the device driver's ability to affine completions to
> the best possible CPUs, and restricts the driver to CPUs in the same
> capacity group.
>
> 2. Why can't the device driver use IRQ affinity to pick the desired
> CPUs for completing IO requests, instead of having this forced by the
> block layer?
>
> 3. CPUs are already grouped based on LLC; why is an additional
> categorization by capacity required?
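
Point 2 above can already be exercised from userspace; a minimal sketch
of steering an interrupt (and hence where completion work starts) via
smp_affinity. The IRQ number and the big-core CPU set below are
assumptions for illustration; real values depend on the SoC and on how
the UFS controller's MCQ completion queues are wired up:

```shell
#!/bin/sh
# Build an affinity mask for the assumed big cores, CPUs 4-7.
mask=$(( (1 << 4) | (1 << 5) | (1 << 6) | (1 << 7) ))
printf '%x\n' "$mask"    # prints f0

# Writing that mask to an IRQ's smp_affinity steers its hardirq to those
# CPUs. 123 is a placeholder IRQ number, so the write is left commented
# out; it also requires root on a real system.
# printf '%x' "$mask" > /proc/irq/123/smp_affinity
```

In-kernel, a driver can express the same preference with
irq_set_affinity_hint() on its completion-queue interrupts.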
>
>> big performance impact if the IO request
>> was done from a CPU with higher capacity but the interrupt is serviced
>> on a lower capacity CPU.
>
> This patch doesn't consider the contention between the submission and
> completion paths. Also, what if we want to complete a request
> submitted on a smaller-capacity CPU on a higher-capacity CPU?
> Shouldn't the device driver take care of this, allowing vendors to use
> the best possible combination for their platform?
> Does the patch consider MCQ devices and their different SQ<->CQ
> mappings?
>
>> Without the patch I see the BLOCK softirq always running on little cores
>> (where the hardirq is serviced). With it I can see it running on all
>> cores.
>
> Why can't we use echo 2 > rq_affinity to force completion on the same
> group of CPUs from which the request was initiated?
> Also, why force vendors to always use SOFTIRQ for completion?
> We should be flexible enough to complete the IO request via IPI,
> HARDIRQ or SOFTIRQ.
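
For reference, rq_affinity is a per-device sysfs knob; a minimal sketch
of the setting mentioned above (the device name "sda" is a placeholder,
and the write requires root, so the sketch guards it):

```shell
#!/bin/sh
# Per Documentation/ABI/stable/sysfs-block, queue/rq_affinity:
#   1 - migrate completions to the submitting CPU's group
#   2 - force completion on the CPU that submitted the request
knob=/sys/block/sda/queue/rq_affinity   # "sda" is a placeholder device
# Guard the write so the sketch is a no-op on machines without the device.
[ -w "$knob" ] && echo 2 > "$knob"
```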
>
>
> An SoC can have many possible CPU configurations, and this patch
> forces a restriction on the completion path. The problem is even worse
> on MCQ devices, since we can have different SQ<->CQ mappings.
>
> So we would like to revert the patch. Please let us know if there are
> any concerns.
>
> Regards
> Manish Pandey