Message-ID: <19307b23f01.fcb555461012595.2202335253480073101@collabora.com>
Date: Thu, 07 Nov 2024 17:35:42 +0000
From: Robert Beckett <bob.beckett@...labora.com>
To: "Keith Busch" <kbusch@...nel.org>
Cc: "Jens Axboe" <axboe@...nel.dk>, "Christoph Hellwig" <hch@....de>,
"Sagi Grimberg" <sagi@...mberg.me>, "kernel" <kernel@...labora.com>,
"linux-nvme" <linux-nvme@...ts.infradead.org>,
"linux-kernel" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] nvme-pci: 512 byte dma pool segment quirk
---- On Thu, 07 Nov 2024 17:19:30 +0000 Keith Busch wrote ---
> On Thu, Nov 07, 2024 at 04:50:46PM +0000, Bob Beckett wrote:
> > @@ -611,7 +612,7 @@ static blk_status_t nvme_pci_setup_prps(struct nvme_dev *dev,
> > }
> >
> > nprps = DIV_ROUND_UP(length, NVME_CTRL_PAGE_SIZE);
> > - if (nprps <= (256 / 8)) {
> > + if (nprps <= (dev->small_dmapool_seg_size / 8)) {
> > pool = dev->prp_small_pool;
> > iod->nr_allocations = 0;
> > } else {
>
> We have a constant expression currently, and this is changing it to a full
> division in the IO path. :(
Yeah, that's fair. Is throughput high enough here for this to be a significant issue? (I have little intuition for this driver.)
How about pre-computing the nprps threshold during pool creation, where we detect the quirk? It would then be a variable comparison instead of a constant comparison, but with no divide.
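A rough userspace sketch of that pre-computation idea, to make the trade-off concrete. The struct and function names here (dev_sketch, small_nprps, setup_small_pool, use_small_pool) are made up for illustration and are not the driver's actual fields or functions:

```c
#include <stdbool.h>
#include <stddef.h>

#define PRP_ENTRY_SIZE 8u	/* one PRP entry is a 64-bit address */

/* Hypothetical stand-in for the relevant part of struct nvme_dev. */
struct dev_sketch {
	size_t small_seg_size;		/* 256 normally, 512 with the quirk */
	unsigned int small_nprps;	/* cached small_seg_size / 8 */
};

/* Setup path: runs once per device, so the divide does not matter here. */
static void setup_small_pool(struct dev_sketch *dev, bool quirk_512)
{
	dev->small_seg_size = quirk_512 ? 512 : 256;
	dev->small_nprps = dev->small_seg_size / PRP_ENTRY_SIZE;
}

/* I/O path: one load and one compare against the cached threshold. */
static bool use_small_pool(const struct dev_sketch *dev, unsigned int nprps)
{
	return nprps <= dev->small_nprps;
}
```

With this, the threshold would be 32 entries without the quirk and 64 with it, and the I/O path never divides.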
>
> Could we leave the pool selection check size as-is and just say the cost
> of the quirk is additional memory overhead?
>
> > @@ -2700,8 +2701,9 @@ static int nvme_setup_prp_pools(struct nvme_dev *dev)
> > return -ENOMEM;
> >
> > /* Optimisation for I/Os between 4k and 128k */
> > - dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev,
> > - 256, 256, 0);
> > + dev->prp_small_pool = dma_pool_create("prp list small", dev->dev,
> > + dev->small_dmapool_seg_size,
> > + dev->small_dmapool_seg_size, 0);
>
> I think it should work if we only change the alignment property of the
> pool. Something like this:
>
> if (dev->ctrl.quirks & NVME_QUIRK_SMALL_DMAPOOL_512)
> dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev,
> 256, 512, 0);
I actually already tested a pool created with 512, 512 while keeping the 256 division above (i.e. wasting half of each segment). I'll re-test against latest to confirm and send a v2, assuming it tests fine.
> else
> dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev,
> 256, 256, 0);
>
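For reference, a small userspace sketch of the arithmetic behind the two variants discussed above. It assumes a 4 KiB controller page and assumes that the DMA pool rounds each block's size up to the requested alignment, which is my understanding of dma_pool_create() but is not verified here. All helper names are hypothetical:

```c
#include <stddef.h>

#define NVME_CTRL_PAGE_SIZE 4096u	/* assumed 4 KiB controller page */
#define PRP_ENTRY_SIZE 8u		/* one PRP entry is a 64-bit address */

/* Round size up to a power-of-two alignment. */
static size_t align_up(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/* Bytes each block occupies, assuming the pool rounds size up to align. */
static size_t block_footprint(size_t size, size_t align)
{
	return align_up(size, align);
}

/* Bytes left unused per block when the driver fills only `used` bytes. */
static size_t wasted_per_block(size_t size, size_t align, size_t used)
{
	return block_footprint(size, align) - used;
}

/* Data covered by the entries that fit in a list of `list_bytes` bytes. */
static size_t list_coverage_bytes(size_t list_bytes)
{
	return (list_bytes / PRP_ENTRY_SIZE) * NVME_CTRL_PAGE_SIZE;
}
```

Under those assumptions, a (256, 512) pool and a (512, 512) pool both cost 512 bytes per block and leave 256 bytes unused while the selection check stays at 256 / 8. A 256-byte list holds 32 entries, which covers 128k and matches the "between 4k and 128k" comment in the patch.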