Message-ID: <Yd8fz4bY/aMMk24h@casper.infradead.org>
Date: Wed, 12 Jan 2022 18:37:03 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: linux-kernel@...r.kernel.org, Christoph Hellwig <hch@....de>,
Joao Martins <joao.m.martins@...cle.com>,
John Hubbard <jhubbard@...dia.com>,
Logan Gunthorpe <logang@...tatee.com>,
Ming Lei <ming.lei@...hat.com>, linux-block@...r.kernel.org,
netdev@...r.kernel.org, linux-mm@...ck.org,
linux-rdma@...r.kernel.org, dri-devel@...ts.freedesktop.org,
nvdimm@...ts.linux.dev
Subject: Re: Phyr Starter
On Tue, Jan 11, 2022 at 06:53:06PM -0400, Jason Gunthorpe wrote:
> IOMMU is not common in those cases, it is slow.
>
> So you end up with 16 bytes per entry then another 24 bytes in the
> entirely redundant scatter list. That is now 40 bytes/page for typical
> HPC case, and I can't see that being OK.
Ah, I didn't realise what case you wanted to optimise for.
So, how about this ...
Since you want to get to the same destination as I do (a
16-byte-per-entry dma_addr+dma_len struct), but need to get there sooner
than "make all sg users stop using it wrongly", let's introduce a
(hopefully temporary) "struct dma_range".
But let's go further than that (which only brings us to 32 bytes per
range). For the systems you care about which use an identity mapping,
and have sizeof(dma_addr_t) == sizeof(phys_addr_t), we can simply
point the dma_range pointer to the same memory as the phyr. We just
have to not free it too early. That gets us down to 16 bytes per range,
a saving of 60% against the 40 bytes above.
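
To make the aliasing idea concrete, here is a rough userspace sketch, not a proposed patch. The field names, the 64-bit typedefs, and the helpers phyr_to_dma() and dma_ranges_put() are all assumptions made for illustration; nothing in this thread defines them. The non-identity path uses a fixed offset purely as a stand-in for a real mapping. The cast relies on the two structs having the same layout and on building without strict aliasing, as the kernel does (-fno-strict-aliasing).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel types; both assumed to be 64-bit. */
typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

/* Hypothetical layouts: 16 bytes per entry, as discussed above. */
struct phyr {
	phys_addr_t addr;
	uint64_t len;
};

struct dma_range {
	dma_addr_t addr;
	uint64_t len;
};

static_assert(sizeof(struct phyr) == 16, "phyr should be 16 bytes");
static_assert(sizeof(struct dma_range) == 16, "dma_range should be 16 bytes");

/*
 * Hypothetical helper: produce the DMA view of n physical ranges.
 *
 * With an identity mapping and equally sized address types, the DMA view
 * is the same memory as the phyr array, so no second array is allocated.
 * Otherwise a separate array is built; adding 'offset' stands in for
 * whatever a real mapping would do.
 */
static struct dma_range *phyr_to_dma(struct phyr *p, size_t n, int identity,
				     dma_addr_t offset, int *aliased)
{
	struct dma_range *d;
	size_t i;

	if (identity && sizeof(dma_addr_t) == sizeof(phys_addr_t)) {
		*aliased = 1;
		return (struct dma_range *)p;
	}

	d = malloc(n * sizeof(*d));
	if (!d)
		return NULL;
	for (i = 0; i < n; i++) {
		d[i].addr = p[i].addr + offset;
		d[i].len = p[i].len;
	}
	*aliased = 0;
	return d;
}

/*
 * Hypothetical release helper. An aliased view owns nothing, so the phyr
 * array must outlive it ("not free it too early"); only a separately
 * allocated view is freed here.
 */
static void dma_ranges_put(struct dma_range *d, int aliased)
{
	if (!aliased)
		free(d);
}
```

In the aliased case the caller holds one 16-byte entry per range in total; in the other case it holds 32 bytes per range, which is the intermediate figure mentioned above.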