[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHS8izNzK4=6AMdACfn9LWqH9GifCL1vVxH1y2DmF9mFZbB72g@mail.gmail.com>
Date: Tue, 15 Oct 2024 01:38:20 +0300
From: Mina Almasry <almasrymina@...gle.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Leon Romanovsky <leonro@...dia.com>, Jakub Kicinski <kuba@...nel.org>,
Christian König <christian.koenig@....com>,
Samiullah Khawaja <skhawaja@...gle.com>, Taehee Yoo <ap420073@...il.com>, davem@...emloft.net,
pabeni@...hat.com, edumazet@...gle.com, netdev@...r.kernel.org,
linux-doc@...r.kernel.org, donald.hunter@...il.com, corbet@....net,
michael.chan@...adcom.com, kory.maincent@...tlin.com, andrew@...n.ch,
maxime.chevallier@...tlin.com, danieller@...dia.com, hengqi@...ux.alibaba.com,
ecree.xilinx@...il.com, przemyslaw.kitszel@...el.com, hkallweit1@...il.com,
ahmed.zaki@...el.com, paul.greenwalt@...el.com, rrameshbabu@...dia.com,
idosch@...dia.com, asml.silence@...il.com, kaiyuanz@...gle.com,
willemb@...gle.com, aleksander.lobakin@...el.com, dw@...idwei.uk,
sridhar.samudrala@...el.com, bcreeley@....com
Subject: Re: [PATCH net-next v3 7/7] bnxt_en: add support for device memory tcp
On Sat, Oct 12, 2024 at 2:42 AM Jason Gunthorpe <jgg@...pe.ca> wrote:
>
> On Fri, Oct 11, 2024 at 10:33:43AM -0700, Mina Almasry wrote:
> > On Thu, Oct 10, 2024 at 6:34 PM Jakub Kicinski <kuba@...nel.org> wrote:
> > >
> > > On Thu, 10 Oct 2024 10:44:38 -0700 Mina Almasry wrote:
> > > > > > I haven't thought the failure of PP_FLAG_DMA_SYNC_DEV
> > > > > > for dmabuf may be wrong.
> > > > > > I think device memory TCP is not related to this flag.
> > > > > > So device memory TCP core API should not return failure when
> > > > > > PP_FLAG_DMA_SYNC_DEV flag is set.
> > > > > > How about removing this condition check code in device memory TCP core?
> > > > >
> > > > > I think we need to invert the check..
> > > > > Mina, WDYT?
> > > >
> > > > On a closer look, my feeling is similar to Taehee,
> > > > PP_FLAG_DMA_SYNC_DEV should be orthogonal to memory providers. The
> > > > memory providers allocate the memory and provide the dma-addr, but
> > > > need not dma-sync the dma-addr, right? The driver can sync the
> > > > dma-addr if it wants and the driver can delegate the syncing to the pp
> > > > via PP_FLAG_DMA_SYNC_DEV if it wants. AFAICT I think the check should
> > > > be removed, not inverted, but I could be missing something.
> > >
> > > I don't know much about dmabuf but it hinges on the question whether
> > > doing DMA sync for device on a dmabuf address is :
> > > - a good thing
> > > - a noop
> > > - a bad thing
> > >
> > > If it's a good thing or a noop - agreed.
> > >
> > > Similar question for the sync for CPU.
> > >
> > > I agree that intuitively it should be all fine. But the fact that dmabuf
> > > has a bespoke API for accessing the memory by the CPU makes me worried
> > > that there may be assumptions about these addresses not getting
> > > randomly fed into the normal DMA API..
> >
> > Sorry I'm also a bit unsure what is the right thing to do here. The
> > code that we've been running in GVE does a dma-sync for cpu
> > unconditionally on RX for dma-buf and non-dmabuf dma-addrs and we
> > haven't been seeing issues. It never does dma-sync for device.
> >
> > My first question is why is dma-sync for device needed on RX path at
> > all for some drivers in the first place? For incoming (non-dmabuf)
> > data, the data is written by the device and read by the cpu, so sync
> > for cpu is really what's needed. Is the sync for device for XDP? Or is
> > it that buffers should be dma-syncd for device before they are
> > re-posted to the NIC?
> >
> > Christian/Jason, sorry quick question: are
> > dma_sync_single_for_{device|cpu} needed or wanted when the dma-addrs
> > come from a dma-buf? Or these dma-addrs to be treated like any other
> > with the normal dma_sync_for_{device|cpu} rules?
>
> Um, I think because dma-buf hacks things up and generates illegal
> scatterlist entries with weird dma_map_resource() addresses for the
> typical P2P case the dma sync API should not be used on those things.
>
> However, there is no way to know if the dma-buf has does this, and
> there are valid case where the scatterlist is not ill formed and the
> sync is necessary.
>
> We are getting soo close to being able to start fixing these API
> issues in dmabuf, I hope next cylce we can begin.. Fingers crossed.
>
> From a CPU architecture perspective you do not need to cache flush PCI
> MMIO BAR memory, and perhaps doing so be might be problematic on some
> arches (???). But you do need to flush normal cachable CPU memory if
> that is in the DMA buf.
>
Thanks Jason. In that case I agree with Jakub we should take in his change here:
https://lore.kernel.org/netdev/20241009170102.1980ed1d@kernel.org/
With this change the driver would delegate dma_sync_for_device to the
page_pool, and the page_pool will skip it altogether for the dma-buf
memory provider.
--
Thanks,
Mina
Powered by blists - more mailing lists