Message-ID: <20190626210018.GB6392@ziepe.ca>
Date: Wed, 26 Jun 2019 18:00:18 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Logan Gunthorpe <logang@...tatee.com>
Cc: Christoph Hellwig <hch@....de>, linux-kernel@...r.kernel.org,
linux-block@...r.kernel.org, linux-nvme@...ts.infradead.org,
linux-pci@...r.kernel.org, linux-rdma@...r.kernel.org,
Jens Axboe <axboe@...nel.dk>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Dan Williams <dan.j.williams@...el.com>,
Sagi Grimberg <sagi@...mberg.me>,
Keith Busch <kbusch@...nel.org>,
Stephen Bates <sbates@...thlin.com>
Subject: Re: [RFC PATCH 00/28] Removing struct page from P2PDMA

On Wed, Jun 26, 2019 at 02:45:38PM -0600, Logan Gunthorpe wrote:
>
>
> On 2019-06-26 2:21 p.m., Jason Gunthorpe wrote:
> > On Wed, Jun 26, 2019 at 12:31:08PM -0600, Logan Gunthorpe wrote:
> >>> we have a hole behind len where we could store flag. Preferably
> >>> optionally based on a P2P or other magic memory types config
> >>> option so that 32-bit systems with 32-bit phys_addr_t actually
> >>> benefit from the smaller and better packing structure.
> >>
> >> That seems sensible. The one thing that's unclear though is how to get
> >> the PCI Bus address when appropriate. Can we pass that in instead of the
> >> phys_addr with an appropriate flag? Or will we need to pass the actual
> >> physical address and then, at the map step, the driver has to some how
> >> lookup the PCI device to figure out the bus offset?
> >
> > I agree with CH, if we go down this path it is a layering violation
> > for the thing injecting bio's into the block stack to know what struct
> > device they egress&dma map on just to be able to do the dma_map up
> > front.
>
> Not sure I agree with this statement. The p2pdma code already *must*
> know and access the pci_dev of the dma device ahead of when it submits
> the IO to know if it's valid to allocate and use P2P memory at all.

I don't think we should make drivers do that. What if it got CMB memory
on some other device?

> > For instance we could use a small hash table of the upper phys addr
> > bits, or an interval tree, to do the lookup.
>
> Yes, if we're going to take a hard stance on this. But using an interval
> tree (or similar) is a lot more work for the CPU to figure out these
> mappings that may not be strictly necessary if we could just pass better
> information down from the submitting driver to the mapping driver.

Right, this is coming down to an optimization argument. I think there
are very few cases (basically yours) where the caller will know this
info, so we need to support the other cases anyhow.

I think with some simple caching this will become negligible for the
cases you care about.

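To make the caching point concrete, here is a rough sketch in plain
userspace C, not kernel code. Every name in it (p2p_range, p2p_map,
p2p_lookup, p2p_phys_to_bus) is hypothetical, a sorted array with a
binary search stands in for the interval tree or hash table, and it
assumes the bus address is simply the physical address minus a
per-range offset. The one-entry cache means back-to-back lookups into
the same BAR skip the search entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one entry per P2P-capable BAR. The array must be
 * sorted by start and the ranges must not overlap. */
struct p2p_range {
	uint64_t start;      /* first CPU physical address of the range */
	uint64_t len;        /* size of the range in bytes */
	uint64_t bus_offset; /* assumption: bus addr = phys addr - bus_offset */
};

struct p2p_map {
	const struct p2p_range *ranges;
	size_t nr;
	const struct p2p_range *last_hit; /* one-entry lookup cache */
};

static int in_range(const struct p2p_range *r, uint64_t phys)
{
	return phys >= r->start && phys - r->start < r->len;
}

/* Return the range containing phys, or NULL if it is not P2P memory. */
static const struct p2p_range *p2p_lookup(struct p2p_map *map, uint64_t phys)
{
	size_t lo = 0, hi;

	assert(map != NULL);

	/* Fast path: same range as the previous lookup. */
	if (map->last_hit && in_range(map->last_hit, phys))
		return map->last_hit;

	/* Slow path: binary search over the sorted ranges. */
	hi = map->nr;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct p2p_range *r = &map->ranges[mid];

		if (phys < r->start) {
			hi = mid;
		} else if (!in_range(r, phys)) {
			lo = mid + 1;
		} else {
			map->last_hit = r;
			return r;
		}
	}
	return NULL;
}

/* Translate phys to a bus address; returns 0 on success, -1 on a miss. */
static int p2p_phys_to_bus(struct p2p_map *map, uint64_t phys, uint64_t *bus)
{
	const struct p2p_range *r = p2p_lookup(map, phys);

	if (!r)
		return -1;
	*bus = phys - r->bus_offset;
	return 0;
}
```

With something of this shape, the mapping driver only needs the
physical address; the submitter does not have to know which device the
I/O will egress on.
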
Jason