Message-ID: <c93e529d-b688-9910-50c4-779c2f85fbc3@huawei.com>
Date: Tue, 16 Aug 2022 21:44:33 +0100
From: John Garry <john.garry@...wei.com>
To: Damien Le Moal <damien.lemoal@...nsource.wdc.com>,
Oliver Sang <oliver.sang@...el.com>
CC: Christoph Hellwig <hch@....de>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
LKML <linux-kernel@...r.kernel.org>,
"Linux Memory Management List" <linux-mm@...ck.org>,
<linux-ide@...r.kernel.org>, <lkp@...ts.01.org>, <lkp@...el.com>,
<ying.huang@...el.com>, <feng.tang@...el.com>,
<zhengjun.xing@...ux.intel.com>, <fengwei.yin@...el.com>
Subject: Re: [ata] 0568e61225: stress-ng.copy-file.ops_per_sec -15.0%
regression
On 16/08/2022 21:02, Damien Le Moal wrote:
>> Could you confirm this? Thanks!
>>
>> On this basis, it appears that max_hw_sectors_kb is getting capped from
>> the SCSI default of 1024 sectors by commit 0568e61225. If it were getting
>> capped by the swiotlb mapping limit then that would give us 512 sectors -
>> that value is fixed.
>>
>> So for my SHT change proposal I am just trying to revert to the previous
>> behaviour in 5.19 - make max_hw_sectors_kb crazy big again.
> I reread the entire thing and I think I had things backwards here. The perf
> regression happens with the 512/512 settings, while the original 1280/32767
> before your patches was OK.
Right, that's as I read it. It would be useful for Oliver to confirm the
results.
> So it is your patch defining the optimal mapping size that causes
> the reduction to 512/512.
The optimal mapping size specifically affects only SAS controllers, so I
think we can ignore that one for now. The reduction to 512/512 comes
from the change in ata_scsi_dev_config().
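For reference, the host-side cap I mean comes from
scsi_add_host_with_dma(); roughly, quoting from memory (so treat this
as a sketch, not the exact source):

	/*
	 * The host DMA device's max mapping size caps shost->max_sectors.
	 * With qemu's swiotlb the mapping limit is 256 KiB, so
	 * 256 KiB >> SECTOR_SHIFT gives the fixed 512 sectors mentioned
	 * above.
	 */
	if (dma_dev->dma_mask) {
		shost->max_sectors = min_t(unsigned long, shost->max_sectors,
				dma_max_mapping_size(dma_dev) >> SECTOR_SHIFT);
	}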
> It would mean that for ATA, we need a sane
> default mapping instead of the SCSI default of 1024 sectors.
Right
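For reference, the sort of SHT change I have in mind would look roughly
like this, e.g. for ahci (a sketch of the idea only, not the exact
patch):

	static struct scsi_host_template ahci_sht = {
		/* ... */
		/*
		 * Make the SHT default large again so scsi_host_alloc()
		 * does not fall back to the SCSI default of 1024 sectors;
		 * ata_dev_configure() still sets the real per-device limit.
		 */
		.max_sectors	= ATA_MAX_SECTORS_LBA48,	/* 65535 */
	};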
> Now I understand your
> proposed change using ATA_MAX_SECTORS_LBA48.
>
> However, that would be correct only for LBA48-capable drives.
> ata_dev_configure() already sets dev->max_sectors correctly according to the
> drive type, capabilities and any quirks. So the problem comes from the
> libata-scsi change:
>
> dev->max_sectors = min(dev->max_sectors, sdev->host->max_sectors);
>
> when sdev->host->max_sectors is 0 (not initialized).
That cannot happen. If sht->max_sectors is 0, then we set
shost->max_sectors to the SCSI default of 1024 sectors in scsi_host_alloc().
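From scsi_host_alloc(), approximately (paraphrasing, but this is the
gist):

	/*
	 * A zero sht->max_sectors never reaches the sdev: it is replaced
	 * by the SCSI default, so shost->max_sectors cannot be 0 here.
	 */
	if (sht->max_sectors)
		shost->max_sectors = sht->max_sectors;
	else
		shost->max_sectors = SCSI_DEFAULT_MAX_SECTORS;	/* 1024 */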
For my proposed change, dev->max_sectors would still be initialized in
ata_dev_configure() according to drive type, etc., and it should be <=
the LBA48 limit of 65535 sectors.
So then in ata_scsi_dev_config():

	dev->max_sectors = min(dev->max_sectors, sdev->host->max_sectors)

only has an impact for ahci controllers whose sdev->host->max_sectors
was capped according to the host DMA device's max mapping size.
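To make that concrete, a worked example for an ahci device behind qemu
swiotlb (representative numbers, assuming the defaults above):

	/*
	 *   dev->max_sectors from ata_dev_configure() = 65535 (LBA48)
	 *   sdev->host->max_sectors                   = 512   (swiotlb cap)
	 *   min(65535, 512)                           = 512
	 *
	 * For a host with no DMA mapping limit, sdev->host->max_sectors
	 * keeps the SHT value and the min() changes nothing.
	 */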
I will admit that this is not ideal. An alternative approach is to change
ata_scsi_dev_config() to cap dev->max_sectors only according to the shost
DMA device's mapping limit (similar to scsi_add_host_with_dma()), but that
would not work for a controller like ipr, which does have a genuine
max_sectors limit (which we should respect).
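Something like this, for illustration (untested sketch; relying on
shost->dma_dev being usable like this is an assumption on my part):

	struct device *dma_dev = sdev->host->dma_dev;

	/*
	 * Cap only by the DMA mapping limit, ignoring any (genuine)
	 * SHT max_sectors limit - which is exactly why this would be
	 * wrong for ipr.
	 */
	if (dma_dev && dma_dev->dma_mask)
		dev->max_sectors = min_t(unsigned int, dev->max_sectors,
				dma_max_mapping_size(dma_dev) >> SECTOR_SHIFT);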
Thanks,
John
> So maybe simply changing
> this line to:
>
> dev->max_sectors = min_not_zero(dev->max_sectors, sdev->host->max_sectors);
>
> would do the trick? Any particular adapter driver that needs a mapping cap on
> the adapter's max mapping size can still set sdev->host->max_sectors as needed, and
> we keep the same defaults as before when it is not set. Thoughts ? Or am I
> missing something else ?
>
>
>>> The regression may come not from commands becoming tiny, but from the fact that
>>> after the patch, max_sectors_kb is too large,
>> I don't think it is, but need confirmation.
>>
>>> causing a lot of overhead with
>>> qemu swiotlb mapping and slowing down IO processing.
>>> Above, it can be seen that we end up with max_sectors_kb being 1280, which is the
>>> default for most scsi disks (including ATA drives). That is normal. But before
>>> that, it was 512, which likely better fits qemu swiotlb and does not generate
>> Again, I don't think this is the case. Need confirmation.
>>
>>> overhead. So the above fix will not change anything I think...