Message-ID: <CAHS8izNBNoMfheMbW5_FS1zMHW61BZVzDLHgv0+E0Zn6U=jD-g@mail.gmail.com>
Date: Tue, 17 Jun 2025 14:02:05 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Yunsheng Lin <linyunsheng@...wei.com>
Cc: Ratheesh Kannoth <rkannoth@...vell.com>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, davem@...emloft.net, edumazet@...gle.com,
kuba@...nel.org, pabeni@...hat.com
Subject: Re: [RFC]Page pool buffers stuck in App's socket queue
On Mon, Jun 16, 2025 at 11:34 PM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>
> On 2025/6/16 16:05, Ratheesh Kannoth wrote:
> > Hi,
> >
> > Recently a customer faced a page pool leak issue and keeps getting the following
> > message in the console:
> > "page_pool_release_retry() stalled pool shutdown 1 inflight 60 sec"
> >
> > The customer runs a "ping" process in the background and then does an interface down/up through the "ip" command.
> >
> > The Marvell octeontx2 driver destroys all resources (including the page pool allocated for each queue of
> > the net device) during the interface down event. The page pool destruction waits for all page pool buffers
> > allocated by that instance to return to the pool, hence the above message (if some buffers
> > are stuck).
> >
> > In the customer scenario, the ping App opens both RAW and RAW6 sockets. Even though the customer pings
> > only an IPv4 address, the RAW6 socket receives some IPv6 Router Advertisement messages which get generated
> > in their network.
> >
> > [ 41.643448] raw6_local_deliver+0xc0/0x1d8
> > [ 41.647539] ip6_protocol_deliver_rcu+0x60/0x490
> > [ 41.652149] ip6_input_finish+0x48/0x70
> > [ 41.655976] ip6_input+0x44/0xcc
> > [ 41.659196] ip6_sublist_rcv_finish+0x48/0x68
> > [ 41.663546] ip6_sublist_rcv+0x16c/0x22c
> > [ 41.667460] ipv6_list_rcv+0xf4/0x12c
> >
> > Those packets will never get processed, and if the customer then does an interface down/up, page pool
> > warnings will be shown in the console.
> >
> > The customer was asking us for a mechanism to drain these sockets, as they don't want to kill their Apps.
> > The proposal is to have a debugfs file which shows "pid last_processed_skb_time number_of_packets socket_fd/inode_number"
> > for each raw4/raw6 socket created in the system, and
> > any write to the debugfs file (some specific command) will drain the socket.
> >
> > 1. Could you please comment on the proposal ?
>
> I would say the above is kind of working around the problem.
> It would be good to fix the Apps or fix the page_pool.
>
> > 2. Could you suggest a better way ?
>
> For fixing the page_pool part, I would suggest keeping track
> of all the inflight pages and detaching those pages from the page_pool when
> page_pool_destroy() is called. The tracking part was [1]; unfortunately
> the maintainers seemed to choose an easy way instead of a long-term
> direction, see [2].
This is not quite accurate IMO. Your patch series and the merged patch
series from Toke do the same thing: both keep track of dma-mapped
pages so that they can be unmapped at page_pool_destroy time. Toke
just did the tracking in a simpler way that people were willing to
review.
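
To make sure we are talking about the same thing, here is very roughly
the shape of the idea both series share. This is a simplified sketch,
not the literal upstream code; the struct, field and helper names
below (pp_sketch, dma_mapped, pp_track_mapping, pp_unmap_all) are
illustrative only. The point is just: remember every DMA mapping the
pool creates, so that page_pool_destroy() can tear the mappings down
even while the pages are still sitting in some socket queue.

#include <linux/xarray.h>
#include <linux/dma-mapping.h>
#include <net/page_pool/helpers.h>

/* Illustrative stand-in for the relevant page_pool state. */
struct pp_sketch {
	struct device *dev;		/* device the pages are mapped for */
	struct xarray dma_mapped;	/* id -> page with a live DMA mapping */
};

/* Called when the pool maps a page: remember the mapping. */
static int pp_track_mapping(struct pp_sketch *pool, struct page *page)
{
	u32 id;

	return xa_alloc(&pool->dma_mapped, &id, page, xa_limit_32b,
			GFP_ATOMIC);
}

/* Called from the destroy path: unmap everything still outstanding,
 * so in-flight pages no longer pin the device's DMA state.  The real
 * code uses the pool's actual order and DMA direction; PAGE_SIZE and
 * DMA_FROM_DEVICE are placeholders here. */
static void pp_unmap_all(struct pp_sketch *pool)
{
	struct page *page;
	unsigned long id;

	xa_for_each(&pool->dma_mapped, id, page) {
		dma_unmap_page(pool->dev, page_pool_get_dma_addr(page),
			       PAGE_SIZE, DMA_FROM_DEVICE);
		xa_erase(&pool->dma_mapped, id);
	}
	xa_destroy(&pool->dma_mapped);
}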
So, if you had a plan to detach pages on page_pool_destroy on top of
your tracking, the exact same plan should work on top of Toke's
tracking. It may be useful to code that and send an RFC if you have
time. It would indeed fix this periodic warning issue.
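
Concretely, building on the sketch above, "detach on destroy" could
look something like the below. Again this is just a sketch on top of
the illustrative names from the previous snippet; pp_clear_pp_ownership()
is a hypothetical helper, and an actual series would have to prove that
the ownership transition is safe against the recycling fast path when
the skb is eventually freed.

/* Hypothetical: strip the page_pool ownership marker so the eventual
 * skb free goes through the normal page allocator instead of trying
 * to recycle into a pool that no longer exists.  Shown as a stub. */
static void pp_clear_pp_ownership(struct page *page)
{
	page->pp_magic = 0;
	page->pp = NULL;
}

/* Hypothetical "detach on destroy" pass, run from page_pool_destroy()
 * before the inflight check. */
static void pp_detach_inflight(struct pp_sketch *pool)
{
	struct page *page;
	unsigned long id;

	xa_for_each(&pool->dma_mapped, id, page) {
		/* Tear down the DMA mapping as in pp_unmap_all(). */
		dma_unmap_page(pool->dev, page_pool_get_dma_addr(page),
			       PAGE_SIZE, DMA_FROM_DEVICE);

		/* Detach the page from the pool. */
		pp_clear_pp_ownership(page);

		/* Stop tracking the page, so the destroy path sees zero
		 * inflight and frees the pool immediately instead of
		 * hitting the "stalled pool shutdown" retry path. */
		xa_erase(&pool->dma_mapped, id);
	}
}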
--
Thanks,
Mina