Message-ID: <CAHS8izNzK4=6AMdACfn9LWqH9GifCL1vVxH1y2DmF9mFZbB72g@mail.gmail.com>
Date: Tue, 15 Oct 2024 01:38:20 +0300
From: Mina Almasry <almasrymina@...gle.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Leon Romanovsky <leonro@...dia.com>, Jakub Kicinski <kuba@...nel.org>, 
	Christian König <christian.koenig@....com>, 
	Samiullah Khawaja <skhawaja@...gle.com>, Taehee Yoo <ap420073@...il.com>, davem@...emloft.net, 
	pabeni@...hat.com, edumazet@...gle.com, netdev@...r.kernel.org, 
	linux-doc@...r.kernel.org, donald.hunter@...il.com, corbet@....net, 
	michael.chan@...adcom.com, kory.maincent@...tlin.com, andrew@...n.ch, 
	maxime.chevallier@...tlin.com, danieller@...dia.com, hengqi@...ux.alibaba.com, 
	ecree.xilinx@...il.com, przemyslaw.kitszel@...el.com, hkallweit1@...il.com, 
	ahmed.zaki@...el.com, paul.greenwalt@...el.com, rrameshbabu@...dia.com, 
	idosch@...dia.com, asml.silence@...il.com, kaiyuanz@...gle.com, 
	willemb@...gle.com, aleksander.lobakin@...el.com, dw@...idwei.uk, 
	sridhar.samudrala@...el.com, bcreeley@....com
Subject: Re: [PATCH net-next v3 7/7] bnxt_en: add support for device memory tcp

On Sat, Oct 12, 2024 at 2:42 AM Jason Gunthorpe <jgg@...pe.ca> wrote:
>
> On Fri, Oct 11, 2024 at 10:33:43AM -0700, Mina Almasry wrote:
> > On Thu, Oct 10, 2024 at 6:34 PM Jakub Kicinski <kuba@...nel.org> wrote:
> > >
> > > On Thu, 10 Oct 2024 10:44:38 -0700 Mina Almasry wrote:
> > > > > > I hadn't considered that failing on PP_FLAG_DMA_SYNC_DEV
> > > > > > for dmabuf may be wrong.
> > > > > > I think device memory TCP is not related to this flag,
> > > > > > so the device memory TCP core API should not return failure
> > > > > > when the PP_FLAG_DMA_SYNC_DEV flag is set.
> > > > > > How about removing this condition check from the device memory TCP core?
> > > > >
> > > > > I think we need to invert the check..
> > > > > Mina, WDYT?
> > > >
> > > > On a closer look, my feeling is similar to Taehee,
> > > > PP_FLAG_DMA_SYNC_DEV should be orthogonal to memory providers. The
> > > > memory providers allocate the memory and provide the dma-addr, but
> > > > need not dma-sync the dma-addr, right? The driver can sync the
> > > > dma-addr if it wants and the driver can delegate the syncing to the pp
> > > > via PP_FLAG_DMA_SYNC_DEV if it wants. AFAICT the check should be
> > > > removed, not inverted, but I could be missing something.
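
(To make the delegation concrete, here is roughly the driver-side setup
I have in mind; the function and ring-size names are made up, only the
page_pool flags and fields are the real API:)

#include <net/page_pool/helpers.h>

/* Hypothetical driver ring setup: with PP_FLAG_DMA_SYNC_DEV set, the
 * page_pool, not the driver, issues dma_sync_single_for_device()
 * before a recycled buffer is handed back to the NIC.
 */
static struct page_pool *my_rx_create_pool(struct device *dev,
                                           unsigned int ring_size)
{
        struct page_pool_params pp_params = {
                /* delegate both DMA mapping and sync-for-device */
                .flags     = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
                .order     = 0,
                .pool_size = ring_size,
                .nid       = NUMA_NO_NODE,
                .dev       = dev,
                .dma_dir   = DMA_FROM_DEVICE,
                .max_len   = PAGE_SIZE, /* region the pool must sync */
                .offset    = 0,
        };

        return page_pool_create(&pp_params);
}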
> > >
> > > I don't know much about dmabuf, but it hinges on the question of
> > > whether doing a DMA sync for device on a dmabuf address is:
> > >  - a good thing
> > >  - a noop
> > >  - a bad thing
> > >
> > > If it's a good thing or a noop - agreed.
> > >
> > > Similar question for the sync for CPU.
> > >
> > > I agree that intuitively it should all be fine. But the fact that dmabuf
> > > has a bespoke API for CPU access to the memory makes me worried
> > > that there may be assumptions about these addresses not getting
> > > randomly fed into the normal DMA API..
> >
> > Sorry, I'm also a bit unsure what the right thing to do here is. The
> > code that we've been running in GVE does a dma-sync for cpu
> > unconditionally on RX for dma-buf and non-dmabuf dma-addrs and we
> > haven't been seeing issues. It never does dma-sync for device.
> >
> > My first question is: why is dma-sync for device needed on the RX path
> > at all for some drivers in the first place? For incoming (non-dmabuf)
> > data, the data is written by the device and read by the cpu, so sync
> > for cpu is really what's needed. Is the sync for device for XDP? Or is
> > it that buffers should be dma-synced for device before they are
> > re-posted to the NIC?
> >
> > Christian/Jason, sorry, quick question: are
> > dma_sync_single_for_{device|cpu} needed or wanted when the dma-addrs
> > come from a dma-buf? Or are these dma-addrs to be treated like any
> > other, with the normal dma_sync_for_{device|cpu} rules?
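
(For concreteness, the pattern I'm asking about on a generic,
non-dmabuf RX path would be something like the below; rx_desc and its
fields are hypothetical, only the dma_sync_single_for_* calls are the
real DMA API:)

#include <linux/dma-mapping.h>

struct rx_desc {
        dma_addr_t dma_addr;    /* mapped buffer address */
        size_t len;             /* bytes written by the device */
};

/* Illustrative only: the two syncs under discussion. */
static void rx_poll_one(struct device *dev, struct rx_desc *desc)
{
        /* The device wrote the packet; make it visible to the CPU. */
        dma_sync_single_for_cpu(dev, desc->dma_addr, desc->len,
                                DMA_FROM_DEVICE);

        /* ... CPU parses/copies the packet here ... */

        /* Hand ownership back to the device before re-posting the
         * buffer; this is the sync-for-device in question.
         */
        dma_sync_single_for_device(dev, desc->dma_addr, desc->len,
                                   DMA_FROM_DEVICE);
}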
>
> Um, I think because dma-buf hacks things up and generates illegal
> scatterlist entries with weird dma_map_resource() addresses for the
> typical P2P case, the DMA sync API should not be used on those things.
>
> However, there is no way to know if the dma-buf has done this, and
> there are valid cases where the scatterlist is not ill-formed and the
> sync is necessary.
>
> We are getting so close to being able to start fixing these API
> issues in dmabuf; I hope next cycle we can begin. Fingers crossed.
>
> From a CPU architecture perspective you do not need to cache flush PCI
> MMIO BAR memory, and perhaps doing so might even be problematic on some
> arches (???). But you do need to flush normal cacheable CPU memory if
> that is in the DMA buf.
>

Thanks Jason. In that case I agree with Jakub that we should take his change here:

https://lore.kernel.org/netdev/20241009170102.1980ed1d@kernel.org/

With this change the driver delegates dma_sync_for_device to the
page_pool, and the page_pool skips it altogether for the dma-buf
memory provider.
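
To sketch the intended end state (this is not Jakub's actual patch; see
the lore link above for that), the pool-side logic would be something
like:

#include <net/page_pool/types.h>
#include <linux/dma-mapping.h>

/* Sketch: the driver keeps requesting PP_FLAG_DMA_SYNC_DEV, and the
 * pool decides per memory provider whether the sync actually happens.
 */
static void maybe_sync_for_device(struct page_pool *pool,
                                  dma_addr_t dma_addr, u32 dma_sync_size)
{
        if (!(pool->p.flags & PP_FLAG_DMA_SYNC_DEV))
                return;         /* driver does its own syncing */

        if (pool->mp_priv)
                return;         /* memory-provider (dmabuf) buffers: skip */

        dma_sync_single_for_device(pool->p.dev, dma_addr,
                                   dma_sync_size, pool->p.dma_dir);
}

That way drivers can set PP_FLAG_DMA_SYNC_DEV unconditionally and don't
need to know what kind of memory backs the pool.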

--
Thanks,
Mina
