Message-ID: <20240814075603.05f8b0f5@kernel.org>
Date: Wed, 14 Aug 2024 07:56:03 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Yonglong Liu <liuyonglong@...wei.com>
Cc: Yunsheng Lin <linyunsheng@...wei.com>, <netdev@...r.kernel.org>,
<davem@...emloft.net>, <edumazet@...gle.com>, <pabeni@...hat.com>,
<ilias.apalodimas@...aro.org>, Jesper Dangaard Brouer <hawk@...nel.org>,
Alexander Duyck <alexander.duyck@...il.com>
Subject: Re: [RFC net] net: make page pool stall netdev unregistration to
avoid IOMMU crashes
On Wed, 14 Aug 2024 18:09:59 +0800 Yonglong Liu wrote:
> On 2024/8/10 11:57, Jakub Kicinski wrote:
> > On Fri, 9 Aug 2024 14:06:02 +0800 Yonglong Liu wrote:
> >> [ 7724.272853] hns3 0000:7d:01.0: page_pool_release_retry(): eno1v0
> >> stalled pool shutdown: id 553, 82 inflight 6706 sec (hold netdev: 1855491)
> > Alright :( You gotta look around for those 82 pages somehow with drgn.
> > bpftrace+kfunc the work that does the periodic print to get the address
> > of the page pool struct and then look around for pages from that pp.. :(
>
> I spent some time learning how to use drgn and found those pages,
> but I think they were allocated by the hns3 driver. How can I find out
> who owns those pages now?
Scan the entire system memory looking for the pointer to this page.
Dump the memory around the location which holds that pointer. If you're
lucky the page will be held by an skb, and the memory around it will
look like struct skb_shared_info. If you're less lucky the page is used
by an sk_buff for the head and the address will not be exact. If you're
less lucky still the page will be directly leaked by the driver, and not
pointed to by anything...
I think the last case is most likely, FWIW.
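
To make the scan above concrete, here is a rough sketch in plain Python.
It is not drgn code and not an existing tool: it assumes you have already
copied a region of kernel memory into a bytes-like buffer (for example
with drgn or from a crash dump), and every address in it is made up.
find_pointers() and context() are hypothetical helper names. An exact
match corresponds to the "pointer to this page" case; a range match
covers the skb head case, where the stored address points somewhere
inside the page rather than at its start.

```python
import struct

WORD = 8          # pointer size, assuming a 64-bit little-endian kernel
PAGE_SIZE = 4096  # assumption; check the page size of the target system


def find_pointers(dump, base, lo, hi):
    """Return addresses of aligned 64-bit words whose value is in [lo, hi).

    dump is a bytes-like copy of memory that started at address base.
    """
    hits = []
    start = (-base) % WORD  # offset of the first pointer-aligned address
    for off in range(start, len(dump) - WORD + 1, WORD):
        (value,) = struct.unpack_from("<Q", dump, off)
        if lo <= value < hi:
            hits.append(base + off)
    return hits


def context(dump, base, addr, before=64, after=64):
    """Return the bytes surrounding addr, clamped to the dump, for inspection."""
    start = max(addr - base - before, 0)
    end = min(addr - base + after, len(dump))
    return bytes(dump[start:end])


# Made-up addresses, for illustration only.
page_struct = 0xFFFFFE0000123440  # hypothetical struct page address
page_virt = 0xFFFF000012345000    # hypothetical mapping of that page
base = 0xFFFF000080000000         # hypothetical start of the dumped region

dump = bytearray(256)
struct.pack_into("<Q", dump, 0x40, page_struct)        # exact reference
struct.pack_into("<Q", dump, 0x80, page_virt + 0x240)  # points into the page

exact = find_pointers(dump, base, page_struct, page_struct + 1)
inside = find_pointers(dump, base, page_virt, page_virt + PAGE_SIZE)
```

Each hit is then a candidate to dump with context() and compare by eye
against the layout of struct skb_shared_info or struct sk_buff. No hits
at all would be consistent with the last case, a page leaked by the
driver.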