lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <75b16ab0-07c0-41d8-9285-0511a10629f7@gmail.com>
Date: Tue, 15 Oct 2024 15:29:26 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: Mina Almasry <almasrymina@...gle.com>, Jason Gunthorpe <jgg@...pe.ca>
Cc: Leon Romanovsky <leonro@...dia.com>, Jakub Kicinski <kuba@...nel.org>,
 Christian König <christian.koenig@....com>,
 Samiullah Khawaja <skhawaja@...gle.com>, Taehee Yoo <ap420073@...il.com>,
 davem@...emloft.net, pabeni@...hat.com, edumazet@...gle.com,
 netdev@...r.kernel.org, linux-doc@...r.kernel.org, donald.hunter@...il.com,
 corbet@....net, michael.chan@...adcom.com, kory.maincent@...tlin.com,
 andrew@...n.ch, maxime.chevallier@...tlin.com, danieller@...dia.com,
 hengqi@...ux.alibaba.com, ecree.xilinx@...il.com,
 przemyslaw.kitszel@...el.com, hkallweit1@...il.com, ahmed.zaki@...el.com,
 paul.greenwalt@...el.com, rrameshbabu@...dia.com, idosch@...dia.com,
 kaiyuanz@...gle.com, willemb@...gle.com, aleksander.lobakin@...el.com,
 dw@...idwei.uk, sridhar.samudrala@...el.com, bcreeley@....com
Subject: Re: [PATCH net-next v3 7/7] bnxt_en: add support for device memory
 tcp

On 10/14/24 23:38, Mina Almasry wrote:
> On Sat, Oct 12, 2024 at 2:42 AM Jason Gunthorpe <jgg@...pe.ca> wrote:
>>
>> On Fri, Oct 11, 2024 at 10:33:43AM -0700, Mina Almasry wrote:
>>> On Thu, Oct 10, 2024 at 6:34 PM Jakub Kicinski <kuba@...nel.org> wrote:
>>>>
>>>> On Thu, 10 Oct 2024 10:44:38 -0700 Mina Almasry wrote:
>>>>>>> I haven't thought the failure of PP_FLAG_DMA_SYNC_DEV
>>>>>>> for dmabuf may be wrong.
>>>>>>> I think device memory TCP is not related to this flag.
>>>>>>> So device memory TCP core API should not return failure when
>>>>>>> PP_FLAG_DMA_SYNC_DEV flag is set.
>>>>>>> How about removing this condition check code in device memory TCP core?
>>>>>>
>>>>>> I think we need to invert the check..
>>>>>> Mina, WDYT?
>>>>>
>>>>> On a closer look, my feeling is similar to Taehee,
>>>>> PP_FLAG_DMA_SYNC_DEV should be orthogonal to memory providers. The
>>>>> memory providers allocate the memory and provide the dma-addr, but
>>>>> need not dma-sync the dma-addr, right? The driver can sync the
>>>>> dma-addr if it wants and the driver can delegate the syncing to the pp
>>>>> via PP_FLAG_DMA_SYNC_DEV if it wants. AFAICT I think the check should
>>>>> be removed, not inverted, but I could be missing something.
>>>>
>>>> I don't know much about dmabuf but it hinges on the question whether
>>>> doing DMA sync for device on a dmabuf address is :
>>>>   - a good thing
>>>>   - a noop
>>>>   - a bad thing
>>>>
>>>> If it's a good thing or a noop - agreed.
>>>>
>>>> Similar question for the sync for CPU.
>>>>
>>>> I agree that intuitively it should be all fine. But the fact that dmabuf
>>>> has a bespoke API for accessing the memory by the CPU makes me worried
>>>> that there may be assumptions about these addresses not getting
>>>> randomly fed into the normal DMA API..
>>>
>>> Sorry I'm also a bit unsure what is the right thing to do here. The
>>> code that we've been running in GVE does a dma-sync for cpu
>>> unconditionally on RX for dma-buf and non-dmabuf dma-addrs and we
>>> haven't been seeing issues. It never does dma-sync for device.
>>>
>>> My first question is why is dma-sync for device needed on RX path at
>>> all for some drivers in the first place? For incoming (non-dmabuf)
>>> data, the data is written by the device and read by the cpu, so sync
>>> for cpu is really what's needed. Is the sync for device for XDP? Or is
>>> it that buffers should be dma-syncd for device before they are
>>> re-posted to the NIC?
>>>
>>> Christian/Jason, sorry quick question: are
>>> dma_sync_single_for_{device|cpu} needed or wanted when the dma-addrs
>>> come from a dma-buf? Or these dma-addrs to be treated like any other
>>> with the normal dma_sync_for_{device|cpu} rules?
>>
>> Um, I think because dma-buf hacks things up and generates illegal
>> scatterlist entries with weird dma_map_resource() addresses for the
>> typical P2P case the dma sync API should not be used on those things.
>>
>> However, there is no way to know if the dma-buf has does this, and
>> there are valid case where the scatterlist is not ill formed and the
>> sync is necessary.
>>
>> We are getting soo close to being able to start fixing these API
>> issues in dmabuf, I hope next cylce we can begin.. Fingers crossed.
>>
>>  From a CPU architecture perspective you do not need to cache flush PCI
>> MMIO BAR memory, and perhaps doing so be might be problematic on some
>> arches (???). But you do need to flush normal cachable CPU memory if
>> that is in the DMA buf.
>>
> 
> Thanks Jason. In that case I agree with Jakub we should take in his change here:
> 
> https://lore.kernel.org/netdev/20241009170102.1980ed1d@kernel.org/
> 
> With this change the driver would delegate dma_sync_for_device to the
> page_pool, and the page_pool will skip it altogether for the dma-buf
> memory provider.

Requiring ->dma_map should be common to all providers as page pool
shouldn't be dipping to net_iovs figuring out how to map them. However,
looking at this discussion seems that the ->dma_sync concern is devmem
specific and should be discarded by pp providers using dmabufs, i.e. in
devmem.c:mp_dmabuf_devmem_init().

-- 
Pavel Begunkov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ