Message-ID: <aFHkpVXoAP5JtCzQ@lore-desk>
Date: Tue, 17 Jun 2025 23:56:53 +0200
From: Lorenzo Bianconi <lorenzo.bianconi@...hat.com>
To: Ratheesh Kannoth <rkannoth@...vell.com>
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, linyunsheng@...wei.com
Subject: Re: [RFC]Page pool buffers stuck in App's socket queue
> Hi,
>
> Recently a customer faced a page pool leak issue and keeps getting the following message on the
> console.
> "page_pool_release_retry() stalled pool shutdown 1 inflight 60 sec"
>
> The customer runs a "ping" process in the background and then does an interface down/up through the "ip" command.
>
> The Marvell octeontx2 driver destroys all resources (including the page pool allocated for each queue of
> the net device) during an interface down event. This page pool destruction waits for all page pool buffers
> allocated by that instance to return to the pool, hence the above message (if some buffers
> are stuck).
>
> In the customer scenario, the ping app opens both RAW and RAW6 sockets. Even though the customer pings
> only an IPv4 address, the RAW6 socket receives some IPv6 Router Advertisement messages that are generated
> in their network.
>
> [ 41.643448] raw6_local_deliver+0xc0/0x1d8
> [ 41.647539] ip6_protocol_deliver_rcu+0x60/0x490
> [ 41.652149] ip6_input_finish+0x48/0x70
> [ 41.655976] ip6_input+0x44/0xcc
> [ 41.659196] ip6_sublist_rcv_finish+0x48/0x68
> [ 41.663546] ip6_sublist_rcv+0x16c/0x22c
> [ 41.667460] ipv6_list_rcv+0xf4/0x12c
>
> Those packets never get processed. And if the customer does an interface down/up, page pool
> warnings are shown on the console.
>
> The customer asked us for a mechanism to drain these sockets, as they don't want to kill their apps.
> The proposal is to have a debugfs file that shows "pid last_processed_skb_time number_of_packets socket_fd/inode_number"
> for each raw6/raw4 socket created in the system;
> any write to the debugfs file (any specific command) will drain the socket.
>
> 1. Could you please comment on the proposal ?
> 2. Could you suggest a better way ?
>
> -Ratheesh
Hi,
this problem reminds me of an issue I had in the past with page_pool
and TCP traffic while destroying the pool (not sure if it is still valid):
https://lore.kernel.org/netdev/ZD2HjZZSOjtsnQaf@lore-desk/
Do you have ongoing TCP flows?
Regards,
Lorenzo
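
For reference, the "drain the socket" idea from the quoted report can be
illustrated from user space. The sketch below is only an assumption about
what draining could look like when done by the process that owns the socket;
it is not the debugfs mechanism proposed above, and drain_socket() is a
hypothetical helper name. It discards every datagram currently queued on a
datagram or raw socket without blocking, so the kernel no longer has to keep
those packets on the receive queue.

```python
import socket


def drain_socket(sock: socket.socket, bufsize: int = 65535) -> int:
    """Discard all datagrams queued on sock and return how many were dropped.

    Hypothetical helper, intended for datagram/raw sockets only.
    """
    dropped = 0
    while True:
        try:
            # MSG_DONTWAIT makes this one call non-blocking without
            # changing the socket's own blocking mode.
            sock.recv(bufsize, socket.MSG_DONTWAIT)
        except BlockingIOError:
            # Receive queue is empty: nothing left to drain.
            return dropped
        dropped += 1
```

An application that opens a RAW6 socket it never reads could call such a
helper periodically, or before an interface down/up. This only helps when
the application can be modified, which is not the case in the scenario
described in the report.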