Message-ID: <20260120152934.2eb16a11@kernel.org>
Date: Tue, 20 Jan 2026 15:29:34 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Leon Hwang <leon.hwang@...ux.dev>
Cc: netdev@...r.kernel.org, Jesper Dangaard Brouer <hawk@...nel.org>, Ilias
Apalodimas <ilias.apalodimas@...aro.org>, Steven Rostedt
<rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>, Mathieu
Desnoyers <mathieu.desnoyers@...icios.com>, "David S . Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Paolo Abeni
<pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
kerneljasonxing@...il.com, lance.yang@...ux.dev, jiayuan.chen@...ux.dev,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org, Leon
Huang Fu <leon.huangfu@...pee.com>
Subject: Re: [PATCH net-next v4] page_pool: Add page_pool_release_stalled
tracepoint
On Tue, 20 Jan 2026 11:16:20 +0800 Leon Hwang wrote:
> I encountered the 'pr_warn()' messages during Mellanox NIC flapping on a
> system using the 'mlx5_core' driver (kernel 6.6). The root cause turned
> out to be an application-level issue: the IBM/sarama “Client SeekBroker
> Connection Leak” [1].

The scenario you are describing matches the situations we run into
at Meta. With the upstream kernel you can tell from the stats that
pages are leaking and, if you care, use drgn to locate them
(in the recv queue).

The 6.6 kernel did not have page pool stats. I feel quite odd about
adding more uAPI because someone is running a 2+ year old kernel
and doesn't have access to the already existing facilities.