Message-ID: <cdfecd37-31d7-42d2-a8d8-92008285b42e@huawei.com>
Date: Thu, 19 Sep 2024 19:15:11 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Jesper Dangaard Brouer <hawk@...nel.org>, Ilias Apalodimas
<ilias.apalodimas@...aro.org>
CC: <davem@...emloft.net>, <kuba@...nel.org>, <pabeni@...hat.com>,
<liuyonglong@...wei.com>, <fanghaiqing@...wei.com>, <zhangkun09@...wei.com>,
Robin Murphy <robin.murphy@....com>, Alexander Duyck
<alexander.duyck@...il.com>, IOMMU <iommu@...ts.linux.dev>, Wei Fang
<wei.fang@....com>, Shenwei Wang <shenwei.wang@....com>, Clark Wang
<xiaoning.wang@....com>, Eric Dumazet <edumazet@...gle.com>, Tony Nguyen
<anthony.l.nguyen@...el.com>, Przemek Kitszel <przemyslaw.kitszel@...el.com>,
Alexander Lobakin <aleksander.lobakin@...el.com>, Alexei Starovoitov
<ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, John Fastabend
<john.fastabend@...il.com>, Saeed Mahameed <saeedm@...dia.com>, Leon
Romanovsky <leon@...nel.org>, Tariq Toukan <tariqt@...dia.com>, Felix Fietkau
<nbd@....name>, Lorenzo Bianconi <lorenzo@...nel.org>, Ryder Lee
<ryder.lee@...iatek.com>, Shayne Chen <shayne.chen@...iatek.com>, Sean Wang
<sean.wang@...iatek.com>, Kalle Valo <kvalo@...nel.org>, Matthias Brugger
<matthias.bgg@...il.com>, AngeloGioacchino Del Regno
<angelogioacchino.delregno@...labora.com>, Andrew Morton
<akpm@...ux-foundation.org>, <imx@...ts.linux.dev>, <netdev@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <intel-wired-lan@...ts.osuosl.org>,
<bpf@...r.kernel.org>, <linux-rdma@...r.kernel.org>,
<linux-wireless@...r.kernel.org>, <linux-arm-kernel@...ts.infradead.org>,
<linux-mediatek@...ts.infradead.org>, <linux-mm@...ck.org>
Subject: Re: [PATCH net 2/2] page_pool: fix IOMMU crash when driver has
already unbound

On 2024/9/19 17:42, Jesper Dangaard Brouer wrote:
>
> On 18/09/2024 19.06, Ilias Apalodimas wrote:
>>> In order not to do the dma unmapping after the driver has already
>>> unbound, and not to stall the unloading of the networking driver,
>>> add the pool->items array to record all the pages, including the
>>> ones which are handed over to the network stack, so the page_pool
>>> can do the dma unmapping for those pages when page_pool_destroy()
>>> is called.
>>
>> So, I was thinking of a very similar idea. But what do you mean by
>> "all"? The pages that are still in caches (slow or fast) of the pool
>> will be unmapped during page_pool_destroy().
>
> I really dislike this idea of having to keep track of all outstanding pages.
>
> I liked Jakub's idea of keeping the netdev around for longer.
>
> This is all related to destroying the struct device that points to
> the DMA engine, right?
Yes, the problem seems to be that when device_del() is called, there is
no guarantee that the hw behind the 'struct device' will be usable, even
if we call get_device() on it.
>
> Why don't we add an API that allows netdev to "give" struct device to
> page_pool. And then the page_pool will take over when we can safely
> free the struct device?
By 'allow netdev to "give" struct device to page_pool', do you mean that
page_pool becomes the driver for the device?
If yes, that seems similar to Jakub's idea, as both seem to stall the
call to device_del() by not returning while the driver is unloading.
If no, it seems the problem still exists once the driver for the device
has unbound after device_del() is called.
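
For reference, the pool->items approach from the patch description quoted
above can be sketched roughly as below. This is only a user-space
illustration and not the code in the patch: sketch_pool, sketch_item and
the sketch_*() helpers are hypothetical names, and the dma_mapped flag
merely stands in for a real DMA mapping. The point is that the pool
records every page it hands out, so destroy can unmap the in-flight ones
while the device is still usable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_POOL_ITEMS 4

/* Hypothetical stand-in for one tracked page. */
struct sketch_item {
	bool in_use;      /* handed out by the pool (cache or network stack) */
	bool dma_mapped;  /* stands in for a live DMA mapping */
};

/* Hypothetical stand-in for a pool that tracks all of its pages. */
struct sketch_pool {
	struct sketch_item items[SKETCH_POOL_ITEMS];
	bool destroyed;
};

/* Hand out a page: claim a free slot and record it as DMA mapped. */
static struct sketch_item *sketch_alloc(struct sketch_pool *pool)
{
	if (pool->destroyed)
		return NULL;

	for (size_t i = 0; i < SKETCH_POOL_ITEMS; i++) {
		if (!pool->items[i].in_use) {
			pool->items[i].in_use = true;
			pool->items[i].dma_mapped = true;
			return &pool->items[i];
		}
	}
	return NULL;
}

/* Page comes back from the stack: drop its mapping and free the slot. */
static void sketch_release(struct sketch_item *item)
{
	item->dma_mapped = false;
	item->in_use = false;
}

/*
 * Destroy: walk every recorded item and unmap the ones still mapped,
 * including pages that are still held elsewhere. Returns how many
 * mappings were torn down here.
 */
static size_t sketch_destroy(struct sketch_pool *pool)
{
	size_t unmapped = 0;

	for (size_t i = 0; i < SKETCH_POOL_ITEMS; i++) {
		if (pool->items[i].in_use && pool->items[i].dma_mapped) {
			pool->items[i].dma_mapped = false;
			unmapped++;
		}
	}
	pool->destroyed = true;
	return unmapped;
}

/* Two pages handed out, one returned, one still in flight at destroy. */
static size_t sketch_demo(void)
{
	struct sketch_pool pool = {0};
	struct sketch_item *a = sketch_alloc(&pool);
	struct sketch_item *b = sketch_alloc(&pool);

	assert(a && b);
	sketch_release(a);
	return sketch_destroy(&pool);
}
```

In this sketch the page still held at destroy time is unmapped by
sketch_destroy() itself, so nothing needs to touch the device after the
driver has unbound. Locking and the real page lifetime are left out.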
>
> --Jesper