Message-ID: <5060d75e-46c0-4d29-a334-62c7e9714fa7@t-8ch.de>
Date: Thu, 28 Apr 2022 16:44:47 +0200
From: Thomas Weißschuh <linux@...ssschuh.net>
To: Christoph Hellwig <hch@....de>
Cc: Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...com>,
Sagi Grimberg <sagi@...mberg.me>, linux-kernel@...r.kernel.org,
linux-nvme@...ts.infradead.org
Subject: Re: [PATCH] nvme-pci: fix host memory buffer allocation size
On 2022-04-28 16:36+0200, Christoph Hellwig wrote:
> On Thu, Apr 28, 2022 at 12:19:22PM +0200, Thomas Weißschuh wrote:
> > We want to allocate the smallest possible amount of buffers with the
> > largest possible size (1 buffer of size "hmpre").
> >
> > Previously we were allocating as many buffers as possible of the smallest
> > possible size.
> > This also led to "hmpre" not being satisfied, as not enough buffer slots
> > were available.
> >
> > Signed-off-by: Thomas Weißschuh <linux@...ssschuh.net>
> > ---
> >
> > Also discussed at https://lore.kernel.org/linux-nvme/f94565db-f217-4a56-83c3-c6429807185c@t-8ch.de/
> >
> > drivers/nvme/host/pci.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index 3aacf1c0d5a5..0546523cc20b 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -2090,7 +2090,7 @@ static int __nvme_alloc_host_mem(struct nvme_dev *dev, u64 preferred,
> >
> > static int nvme_alloc_host_mem(struct nvme_dev *dev, u64 min, u64 preferred)
> > {
> > - u64 min_chunk = min_t(u64, preferred, PAGE_SIZE * MAX_ORDER_NR_PAGES);
> > + u64 min_chunk = max_t(u64, preferred, PAGE_SIZE * MAX_ORDER_NR_PAGES);
>
> preferred is based on the HMPRE field in the spec, which documents the
> preferred size. So the max here would not make any sense at all.
Is the current code supposed to reach HMPRE? It does not for me.
The code tries to allocate memory for HMPRE in chunks.
The best allocation would be to allocate one chunk for all of HMPRE.
If this fails we halve the chunk size on each iteration and try again.
On my hardware we start with a chunk_size of 4 MiB and just allocate
8 (hmmaxd) * 4 MiB = 32 MiB, which is worse than 1 * 200 MiB.
What am I missing?