[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190625072008.GB30350@lst.de>
Date: Tue, 25 Jun 2019 09:20:08 +0200
From: Christoph Hellwig <hch@....de>
To: Logan Gunthorpe <logang@...tatee.com>
Cc: Christoph Hellwig <hch@....de>, linux-kernel@...r.kernel.org,
linux-block@...r.kernel.org, linux-nvme@...ts.infradead.org,
linux-pci@...r.kernel.org, linux-rdma@...r.kernel.org,
Jens Axboe <axboe@...nel.dk>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Dan Williams <dan.j.williams@...el.com>,
Sagi Grimberg <sagi@...mberg.me>,
Keith Busch <kbusch@...nel.org>,
Jason Gunthorpe <jgg@...pe.ca>,
Stephen Bates <sbates@...thlin.com>
Subject: Re: [RFC PATCH 00/28] Removing struct page from P2PDMA
On Mon, Jun 24, 2019 at 10:07:56AM -0600, Logan Gunthorpe wrote:
> > For one passing a dma_addr_t through the block layer is a layering
> > violation, and one that I think will also bite us in practice.
> > The host physical to PCIe bus address mapping can have offsets, and
> > those offsets absolutely can be different for differnet root ports.
> > So with your caller generated dma_addr_t everything works fine with
> > a switched setup as the one you are probably testing on, but on a
> > sufficiently complicated setup with multiple root ports it can break.
>
> I don't follow this argument. Yes, I understand PCI Bus offsets and yes
> I understand that they only apply beyond the bus they're working with.
> But this isn't *that* complicated and it should be the responsibility of
> the P2PDMA code to sort out and provide a dma_addr_t for. The dma_addr_t
> that's passed through the block layer could be a bus address or it could
> be the result of a dma_map_* request (if the transaction is found to go
> through an RC) depending on the requirements of the devices being used.
You assume all addressing is done by the PCI bus address. If a device
is addressing its own BAR there is no reason to use the PCI bus address,
as it might have much more intelligent schemes (usually bar + offset).
>
> > Also duplicating the whole block I/O stack, including hooks all over
> > the fast path is pretty much a no-go.
>
> There was very little duplicate code in the patch set. (Really just the
> mapping code). There are a few hooks, but in practice not that many if
> we ignore the WARN_ONs. We might be able to work to reduce this further.
> The main hooks are: when we skip bouncing, when we skip integrity prep,
> when we split, and when we map. And the patchset drops the PCI_P2PDMA
> hook when we map. So we're talking about maybe three or four extra ifs
> that would likely normally be fast due to the branch predictor.
And all of those add code to the block layer fast path.
Powered by blists - more mailing lists