Message-ID: <04169545-c40d-4ae1-8fd9-39e4c47472a4@kernel.org>
Date: Fri, 5 Sep 2025 09:59:18 +0200
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Vincent Li <vincent.mc.li@...il.com>, Dragos Tatulea <dtatulea@...dia.com>
Cc: netdev@...r.kernel.org, xdp-newbies@...r.kernel.org,
 loongarch@...ts.linux.dev, Furong Xu <0x1207@...il.com>,
 Maxime Coquelin <mcoquelin.stm32@...il.com>,
 Alexandre Torgue <alexandre.torgue@...s.st.com>,
 Huacai Chen <chenhuacai@...nel.org>, Jakub Kicinski <kuba@...nel.org>,
 Mina Almasry <almasrymina@...gle.com>, Philipp Stanner <phasta@...nel.org>,
 Ilias Apalodimas <ilias.apalodimas@...aro.org>,
 Qunqin Zhao <zhaoqunqin@...ngson.cn>, Yanteng Si <si.yanteng@...ux.dev>,
 Andrew Lunn <andrew+netdev@...n.ch>, Toke Høiland-Jørgensen <toke@...nel.org>
Subject: Re: [BUG?] driver stmmac reports page_pool_release_retry() stalled
 pool shutdown every minute



On 02/09/2025 01.27, Vincent Li wrote:
> On Mon, Sep 1, 2025 at 10:56 AM Vincent Li <vincent.mc.li@...il.com> wrote:
>>
>> On Mon, Sep 1, 2025 at 2:23 AM Jesper Dangaard Brouer <hawk@...nel.org> wrote:
>>>
>>> On 01/09/2025 04.47, Vincent Li wrote:
>>>> Hi,
>>>>
>>>> I noticed that once I attached an XDP program to a dwmac-loongson-pci
>>>> network device on a LoongArch PC, the kernel logs the stalled pool
>>>> message below every minute; it seems not to affect network traffic
>>>> though. It does not seem to be architecture dependent, so I decided to
>>>> report this to the netdev and XDP mailing lists in case there is a bug
>>>> in stmmac-related network devices with XDP.
>>>>
>>>
>>> Dragos (Cc'ed) gave a very detailed talk[1] about debugging page_pool
>>> leaks, that I highly recommend:
>>>    [1]
>>> https://netdevconf.info/0x19/sessions/tutorial/diagnosing-page-pool-leaks.html
>>>
>>> Before doing kernel debugging with drgn, I have some easier steps, I
>>> want you to perform on your hardware (I cannot reproduce given I don't
>>> have this hardware).
>>
>> I watched the video and slides; I would have difficulty running drgn
>> since the loongfire OS [0] I am running does not have proper python
>> support. loongfire is a port of IPFire to the LoongArch architecture.
>> The kernel is upstream stable release 6.15.9 with a backport of the
>> LoongArch BPF trampoline for supporting xdp-tools. I run loongfire on a
>> LoongArch PC for my home Internet. I tried to reproduce this issue on
>> the LoongArch PC with a Fedora desktop OS release and the same kernel
>> 6.15.9, but I can't reproduce the issue; not sure if this is only
>> reproducible on firewall/router-like Linux OSes with an stmmac device.
>>
>>>
>>> First step is to check if any socket has unprocessed packets stalled
>>> in its receive-queue (Recv-Q).  Use the command 'netstat -tapenu' and
>>> look at the "Recv-Q" column.  If any socket/application has not emptied
>>> its Recv-Q, try to restart that service and see if the "stalled pool
>>> shutdown" goes away.
>>
>> the Recv-Q shows 0 from  'netstat -tapenu'
>>

This tells us that it wasn't an easy case of packets waiting in a socket
queue, which indicates a higher probability of a driver issue.
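For future runs, the sockets with a non-empty Recv-Q can be filtered in
one go. This is a sketch that assumes the net-tools netstat output
layout, where Recv-Q is the second column ('ss -tuanp' is the modern
equivalent of 'netstat -tapenu'):

```shell
# Skip the two header lines; print rows whose Recv-Q (column 2) is non-zero.
netstat -tapenu | awk 'NR > 2 && $2 ~ /^[0-9]+$/ && $2 > 0'
```

An empty result here corresponds to what you observed: no socket is
sitting on unprocessed packets.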

>>   [root@...ngfire ~]#  netstat -tapenu
>> Active Internet connections (servers and established)
>> Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode      PID/Program name
>> tcp        0      0 127.0.0.1:8953          0.0.0.0:*               LISTEN      0          10283      1896/unbound
>> tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      0          10281      1896/unbound
>> tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      0          8708       2823/sshd: /usr/sbi
>> tcp        0    272 192.168.9.1:22          192.168.9.13:58660      ESTABLISHED 0          8754       3004/sshd-session:
>> tcp6       0      0 :::81                   :::*                    LISTEN      0          7828       2841/httpd
>> tcp6       0      0 :::444                  :::*                    LISTEN      0          7832       2841/httpd
>> tcp6       0      0 :::1013                 :::*                    LISTEN      0          7836       2841/httpd
>> tcp6       0      0 10.0.0.229:444          192.168.9.13:58762      TIME_WAIT   0          0          -
>> udp        0      0 0.0.0.0:53              0.0.0.0:*                           0          10280      1896/unbound
>> udp        0      0 0.0.0.0:67              0.0.0.0:*                           0          10647      2803/dhcpd
>> udp        0      0 10.0.0.229:68           0.0.0.0:*                           0          8644       2659/dhcpcd: [BOOTP
>> udp        0      0 10.0.0.229:123          0.0.0.0:*                           0          8679       2757/ntpd
>> udp        0      0 192.168.9.1:123         0.0.0.0:*                           0          8678       2757/ntpd
>> udp        0      0 127.0.0.1:123           0.0.0.0:*                           0          8677       2757/ntpd
>> udp        0      0 0.0.0.0:123             0.0.0.0:*                           0          8670       2757/ntpd
>> udp        0      0 0.0.0.0:514             0.0.0.0:*                           0          5689       1864/syslogd
>> udp6       0      0 :::123                  :::*                                0          8667       2757/ntpd
>>
>>> Second step is compiling the kernel with CONFIG_DEBUG_VM enabled. This
>>> will warn us if the driver leaked a page_pool-controlled page without
>>> first "releasing" it correctly.  See commit dba1b8a7ab68 ("mm/page_pool:
>>> catch page_pool memory leaks") for what the warning looks like.
>>>    (p.s. this CONFIG_DEBUG_VM option has surprisingly low overhead, as
>>> long as you don't select any sub-options, so we choose to run with it
>>> in production).
>>>
>>
>> I added CONFIG_DEBUG_VM and recompiled the kernel, but there is no
>> kernel warning message about a page leak; maybe a false positive?
>>

This just tells us that the inflight page_pool page wasn't "illegally"
returned to the MM-subsystem.  So, this page is stuck somewhere in the
system, still "registered" to a page_pool instance. This is an even
stronger indication of a driver bug.

We are almost out of easy options to try.  The last attempt I want you
to try is to unload the NIC driver's kernel module (via rmmod), and then
wait to see if the "stalled pool shutdown" messages disappear. I hope
you have a serial console, so you can still observe the kernel log.

If the "stalled pool shutdown" messages continue, then we have to use
the same techniques that Dragos did.

Basically, this means scanning all pages in memory looking for the
PP_SIGNATURE bits.  Here is some example[1] code that walks all memory
pages from the kernel side.  It doesn't actually work as a kernel module
as-is; if I were you, I would just copy-paste it into the driver or
page_pool code, and call it when we see the stalled messages.  This will
help us identify the page_pool page. (After which I would use drgn to
investigate its state.)

[1] 
https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/bench/page_bench06_walk_all.c#L68-L97


>> [root@...ngfire ~]# grep 'CONFIG_DEBUG_VM=y' /boot/config-6.15.9-ipfire
>>
>> CONFIG_DEBUG_VM=y
>>
>> [root@...ngfire ~]# grep -E 'MEM_TYPE_PAGE_POOL|stalled' /var/log/kern.log
>>
>> Sep  1 10:23:19 loongfire kernel: [    7.484986] dwmac-loongson-pci 0000:00:03.0 green0: Register MEM_TYPE_PAGE_POOL RxQ-0
>> Sep  1 10:26:44 loongfire kernel: [  212.514302] dwmac-loongson-pci 0000:00:03.0 green0: Register MEM_TYPE_PAGE_POOL RxQ-0
>> Sep  1 10:27:44 loongfire kernel: [  272.911878] page_pool_release_retry() stalled pool shutdown: id 9, 1 inflight 60 sec
>> Sep  1 10:28:44 loongfire kernel: [  333.327876] page_pool_release_retry() stalled pool shutdown: id 9, 1 inflight 120 sec
>> Sep  1 10:29:45 loongfire kernel: [  393.743877] page_pool_release_retry() stalled pool shutdown: id 9, 1 inflight 181 sec
>>
> 
> I came up with a fentry bpf program [0]
> https://github.com/vincentmli/loongfire/issues/3 to trace the netdev
> value in page_pool_release_retry():
> 
>          /* Periodic warning for page pools the user can't see */
>          netdev = READ_ONCE(pool->slow.netdev);
>          if (time_after_eq(jiffies, pool->defer_warn) &&
>              (!netdev || netdev == NET_PTR_POISON)) {
>                  int sec = (s32)((u32)jiffies - (u32)pool->defer_start) / HZ;
> 
>                  pr_warn("%s() stalled pool shutdown: id %u, %d inflight %d sec\n",
>                          __func__, pool->user.id, inflight, sec);
>                  pool->defer_warn = jiffies + DEFER_WARN_INTERVAL;
>          }
> 
> The bpf program prints netdev as NULL. I wonder if there is a left-over
> page pool allocated initially by the stmmac driver, and after attaching
> the XDP program, the page pool allocated initially had its netdev
> changed to NULL?
> 
> Page Pool: 0x900000010b54f000
>    netdev pointer: 0x0
>    is NULL: YES
>    is NET_PTR_POISON: NO
>    condition (!netdev || netdev == NET_PTR_POISON): TRUE
> 
> Page Pool: 0x900000010b54f000
>    netdev pointer: 0x0
>    is NULL: YES
>    is NET_PTR_POISON: NO
>    condition (!netdev || netdev == NET_PTR_POISON): TRUE
> 
>>> Third step is doing kernel debugging like Dragos did in [1].
>>>
>>> What kernel version are you using?
>>
>> kernel 6.15.9
>>

Nice, that is a very recent kernel.
The above shows us that we are indeed hitting the issue of a "hidden"
page_pool instance (related to the page_pool commit Jakub/Kuba added).
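As a pointer for the netlink question further down: on kernels >= v6.8
the page_pool instances known to the kernel can be dumped over the
netdev netlink family with the YNL CLI shipped in the kernel source
tree. The paths and operation name below are taken from recent trees;
verify them against yours:

```shell
# Run from a kernel source tree; dumps all page_pools on the system.
./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
	--dump page-pool-get
```

A pool whose id shows up here but whose netdev is gone would match the
"hidden" instance the stall warning complains about.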


>>>
>>> In kernel v6.8 we (Kuba) silenced some of the cases.  See commit
>>> be0096676e23 ("net: page_pool: mute the periodic warning for visible
>>> page pools").
>>> To Jakub/kuba can you remind us how to use the netlink tools that can
>>> help us inspect the page_pools active on the system?
>>>
>>>
>>>> xdp-filter load green0
>>>>
>>>
>>> Most drivers change the memory model and reset the RX rings when
>>> attaching XDP.  So, it makes sense that the existing page_pool
>>> instances (per RXq) are freed and new ones allocated, revealing any
>>> leaked or unprocessed page_pool pages.
>>>
>>>
>>>> Aug 31 19:19:06 loongfire kernel: [200871.855044] dwmac-loongson-pci 0000:00:03.0 green0: Register MEM_TYPE_PAGE_POOL RxQ-0
>>>> Aug 31 19:19:07 loongfire kernel: [200872.810587] page_pool_release_retry() stalled pool shutdown: id 9, 1 inflight 200399 sec
>>>
>>> It is very weird that a stall time of 200399 sec is reported. This
>>> indicates that this has been happening *before* the xdp-filter was
>>> attached. The uptime "200871.855044" indicates the leak happened 472
>>> sec after booting this system.
>>>
>>
>> Not sure if I pasted the previous log message correctly, but this time
>> the log I pasted should be correct.
>>
>>> Have you seen these dmesg logs before attaching XDP?
>>
>> I didn't see such a log before attaching XDP.
>>

From the above we have established that this makes sense, as the
mentioned commit would have "blocked" the warning from being printed.

>>>
>>> This will help us know if this page_pool became "invisible" according to
>>> Kuba's change, if you run kernel >= v6.8.
>>>
>>>
>>>> Aug 31 19:20:07 loongfire kernel: [200933.226488] page_pool_release_retry() stalled pool shutdown: id 9, 1 inflight 200460 sec
>>>> Aug 31 19:21:08 loongfire kernel: [200993.642391] page_pool_release_retry() stalled pool shutdown: id 9, 1 inflight 200520 sec
>>>> Aug 31 19:22:08 loongfire kernel: [201054.058292] page_pool_release_retry() stalled pool shutdown: id 9, 1 inflight 200581 sec
>>>>
>>>
>>> Cc'ed some people that might have access to this hardware, can any of
>>> you reproduce?
>>>

Anyone with this hardware?

>>
>> [0]: https://github.com/vincentmli/loongfire

--Jesper
