Message-ID: <20220510070356.GA11660@lst.de>
Date:   Tue, 10 May 2022 09:03:56 +0200
From:   Christoph Hellwig <hch@....de>
To:     Thomas Weißschuh <linux@...ssschuh.net>
Cc:     Christoph Hellwig <hch@....de>, Keith Busch <kbusch@...nel.org>,
        Jens Axboe <axboe@...com>, Sagi Grimberg <sagi@...mberg.me>,
        linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org
Subject: Re: [PATCH] nvme-pci: fix host memory buffer allocation size

On Thu, Apr 28, 2022 at 06:09:11PM +0200, Thomas Weißschuh wrote:
> > > On my hardware we start with a chunk_size of 4MiB and just allocate
> > > 8 (hmmaxd) * 4 = 32 MiB, which is worse than 1 * 200MiB.
> > 
> > And that is because the hardware only has a limited set of descriptors.
> 
> Wouldn't it make more sense, then, to allocate as much memory as possible
> for each available descriptor?
> 
> The comment in nvme_alloc_host_mem() tries to "start big".
> But it actually starts with at most 4MiB.

Compared to what other operating systems offer, that is quite large.
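
For reference, the sizing logic in nvme_alloc_host_mem() is roughly the
following (paraphrased from drivers/nvme/host/pci.c, so treat the details
as approximate):

	/* capped at the largest page-allocator order, i.e. 4 MiB on x86
	 * with 4k pages */
	u64 min_chunk = min_t(u64, preferred, PAGE_SIZE * MAX_ORDER_NR_PAGES);
	u64 hmminds = max_t(u32, dev->ctrl.hmminds * 4096, PAGE_SIZE * 2);
	u64 chunk_size;

	/* "start big" therefore means: start at no more than 4 MiB */
	for (chunk_size = min_chunk; chunk_size >= hmminds; chunk_size /= 2) {
		if (!__nvme_alloc_host_mem(dev, preferred, chunk_size)) {
			if (!min || dev->host_mem_size >= min)
				return 0;
			nvme_free_host_mem(dev);
		}
	}

	/* if hmminds > min_chunk the loop body never runs and no HMB
	 * is set up at all */
	return -ENOMEM;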

> And on devices that have hmminds > 4MiB the loop condition will never succeed
> at all and HMB will not be used.
> My fairly boring hardware already has an hmminds of 3.3MiB.
> 
> > Is there any real problem you are fixing with this?  Do you actually
> > see a performance difference on a relevant workload?
> 
> I don't have a concrete problem or performance issue.
> During some debugging I stumbled upon
> "nvme nvme0: allocated 32 MiB host memory buffer"
> in my kernel logs and investigated why it was so low.

Until recently we could not even support these large sizes at all on
typical x86 configs.  With my fairly recent change to allow vmap-remapped
IOMMU allocations on x86 we can do that now.  But if we enabled it
unconditionally, I'd be a little worried about using too much memory
too easily.
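
The chunks themselves come from dma_alloc_attrs(), roughly one call per
HMB descriptor (again paraphrased):

	/* the HMB is device-private memory, never touched by the host
	 * CPU, so no kernel mapping is needed; behind an IOMMU the
	 * buffer can be stitched together from smaller pages that are
	 * only contiguous in IOVA space */
	bufs[i] = dma_alloc_attrs(dev->dev, len, &dma_addr, GFP_KERNEL,
			DMA_ATTR_NO_KERNEL_MAPPING | DMA_ATTR_NO_WARN);

which is why the chunk size stops being limited by MAX_ORDER once the
IOMMU remapping path can handle these allocations.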

We could look into removing the min() against
PAGE_SIZE * MAX_ORDER_NR_PAGES to try larger segments for
"segment-challenged" controllers, now that it can work on a lot of
IOMMU-enabled setups.  But I'd rather have a very good reason for
that.
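
Concretely it would be little more than the below (untested, just to show
what removing the min means):

-	u64 min_chunk = min_t(u64, preferred, PAGE_SIZE * MAX_ORDER_NR_PAGES);
+	u64 min_chunk = preferred;

i.e. start the loop at the controller's preferred size and only fall back
to smaller chunks if that does not get us at least the minimum size.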
