lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5d9ea7bd-67bb-4a9d-a120-c8f290c31a47@huawei.com>
Date: Tue, 15 Oct 2024 18:52:34 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Jakub Kicinski <kuba@...nel.org>
CC: <davem@...emloft.net>, <pabeni@...hat.com>, <liuyonglong@...wei.com>,
	<fanghaiqing@...wei.com>, <zhangkun09@...wei.com>, Alexander Lobakin
	<aleksander.lobakin@...el.com>, Robin Murphy <robin.murphy@....com>,
	Alexander Duyck <alexander.duyck@...il.com>, IOMMU <iommu@...ts.linux.dev>,
	Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
	Jesper Dangaard Brouer <hawk@...nel.org>, John Fastabend
	<john.fastabend@...il.com>, Matthias Brugger <matthias.bgg@...il.com>,
	AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
	<netdev@...r.kernel.org>, <intel-wired-lan@...ts.osuosl.org>,
	<bpf@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<linux-arm-kernel@...ts.infradead.org>, <linux-mediatek@...ts.infradead.org>
Subject: Re: [PATCH net v2 0/2] fix two bugs related to page_pool

On 2024/10/15 8:14, Jakub Kicinski wrote:
> On Sat, 12 Oct 2024 20:05:31 +0800 Yunsheng Lin wrote:
>> 1. Semantics changing of supporting unlimited inflight pages
>>    to limited inflight pages that are as large as the pool_size
>>    of page_pool.
> 
> How can this possibly work?

As a similar comment in [1], do we really need unlimited inflight pages
for the page_pool to work? If we do, it seems there is something really
need fixing here. I am agreed changing of semantics here might introduce
regressions here because there may be some subsystem depending on the
previous semantics or incorrect calculating of how many inflight pages it
is needed, so I am agreed that it might be better to target the net-next
tree to give some cycles of testing before backporting it.

1. https://lore.kernel.org/all/842c8cc6-f716-437a-bc98-70bc26d6fd38@huawei.com/

> 
> The main thing stopping me from reposting my fix that it'd be nice to
> figure out if a real IOMMU device is bound or not. If we don't have

device_iommu_mapped() might be used to check if a real IOMMU device is
bound or not.
I am afraid it is not just about IOMMU here as there might be other
resource behind the dma mapping, like the bounce buffer memory as below:

https://elixir.bootlin.com/linux/v6.7-rc8/source/drivers/iommu/dma-iommu.c#L1204
https://elixir.bootlin.com/linux/v6.7-rc8/source/kernel/dma/direct.h#L125

And we may argue is_swiotlb_active() can be used check if there is any
bounce buffer memory behind the dma mapping as the device_iommu_mapped()
does, but I am not sure if there is any other resource besides iommu and
bounce buffer.

> real per-device mappings we presumably don't have to wait. If we can 

For not having to wait part:
I am not sure if the page_pool_destroy()/__page_pool_release_page_dma()
need to synchronize with arch_teardown_dma_ops() or how to synchronize
with it, as it seems to be called when driver unloading even if we have
ensured that there is no IOMMU or bounce buffer memory behind the device
by the above checking:
__device_release_driver -> device_unbind_cleanup -> arch_teardown_dma_ops

> check this condition we are guaranteed not to introduce regressions,
> since we would be replacing a crash by a wait, which is strictly better.

For the waiting part:
The problem is how much time we need to wait when device_iommu_mapped()
or is_swiotlb_active() return true here, as mentioned in [2], [3]. And
currently the waiting might be infinite as the testing in [4].

> 
> If we'd need to fiddle with too many internals to find out if we have
> to wait - let's just always wait and see if anyone complains.

Does the testing report in [4] classify as someone complaining? As
the driver unloading seems to be stalling forever, and the cause of the
infinite stalling is skb_attempt_defer_free() by debugging as mentioned
in [2].

2. https://lore.kernel.org/all/2c5ccfff-6ab4-4aea-bff6-3679ff72cc9a@huawei.com/
3. https://lore.kernel.org/netdev/d50ac1a9-f1e2-49ee-b89b-05dac9bc6ee1@huawei.com/
4. https://lore.kernel.org/netdev/758b4d47-c980-4f66-b4a4-949c3fc4b040@huawei.com/

> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ