Message-ID: <CAHS8izOcLnt3SXzfbSA_vqno0R1SaBbXq-U8_LtRv64Bj7tUSQ@mail.gmail.com>
Date: Fri, 14 Feb 2025 12:27:28 -0800
From: Mina Almasry <almasrymina@...gle.com>
To: Jason Xing <kerneljasonxing@...il.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, horms@...nel.org, hawk@...nel.org,
ilias.apalodimas@...aro.org, netdev@...r.kernel.org
Subject: Re: [PATCH net-next v3] page_pool: avoid infinite loop to schedule
delayed worker
On Thu, Feb 13, 2025 at 10:43 PM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> We noticed that the kworker running page_pool_release_retry() was
> woken up repeatedly and indefinitely in production because a
> buggy driver drove the inflight count below 0, triggering the
> warning in page_pool_inflight()[1].
>
> Once the inflight value goes negative, we should not expect
> the page_pool to return to normal operation.
>
> This patch mitigates the adverse effect by not rescheduling
> the kworker when page_pool_release_retry() detects a negative
> inflight count.
>
> [1]
> [Mon Feb 10 20:36:11 2025] ------------[ cut here ]------------
> [Mon Feb 10 20:36:11 2025] Negative(-51446) inflight packet-pages
> ...
> [Mon Feb 10 20:36:11 2025] Call Trace:
> [Mon Feb 10 20:36:11 2025] page_pool_release_retry+0x23/0x70
> [Mon Feb 10 20:36:11 2025] process_one_work+0x1b1/0x370
> [Mon Feb 10 20:36:11 2025] worker_thread+0x37/0x3a0
> [Mon Feb 10 20:36:11 2025] kthread+0x11a/0x140
> [Mon Feb 10 20:36:11 2025] ? process_one_work+0x370/0x370
> [Mon Feb 10 20:36:11 2025] ? __kthread_cancel_work+0x40/0x40
> [Mon Feb 10 20:36:11 2025] ret_from_fork+0x35/0x40
> [Mon Feb 10 20:36:11 2025] ---[ end trace ebffe800f33e7e34 ]---
> Note: before this patch, the above call trace would flood
> dmesg because the release_dw kworker was rescheduled repeatedly.
>
> Signed-off-by: Jason Xing <kerneljasonxing@...il.com>

Thanks Jason,

Reviewed-by: Mina Almasry <almasrymina@...gle.com>

When you find the root cause of the driver bug, if you can think of
ways to catch it sooner in the page_pool or prevent drivers from
triggering it, please do consider sending improvements upstream.

Thanks!

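
For illustration, the mitigation described in the quoted commit message can be modelled roughly as below. This is a user-space sketch, not the actual patch or kernel code: struct pool_model, pool_inflight(), release_retry_should_reschedule() and simulate_retries() are hypothetical names, and the hold/release counters are an assumed simplification of how the inflight count is derived.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pool state; not the kernel's struct page_pool. */
struct pool_model {
	int hold_cnt;    /* pages handed out by the pool (assumed counter) */
	int release_cnt; /* pages given back to the pool (assumed counter) */
};

/* Inflight pages; goes negative if a buggy driver returns more than it took. */
static int pool_inflight(const struct pool_model *p)
{
	return p->hold_cnt - p->release_cnt;
}

/* Decide whether the delayed release worker should be re-armed. */
static bool release_retry_should_reschedule(const struct pool_model *p)
{
	int inflight = pool_inflight(p);

	if (inflight == 0)
		return false; /* everything came back; nothing left to wait for */
	if (inflight < 0)
		return false; /* accounting is broken; retrying cannot fix it */
	return true;          /* pages still outstanding; check again later */
}

/* Count how often the worker would be re-armed within max_ticks attempts. */
static int simulate_retries(const struct pool_model *p, int max_ticks)
{
	int rearmed = 0;

	assert(max_ticks >= 0);
	while (rearmed < max_ticks && release_retry_should_reschedule(p))
		rearmed++;
	return rearmed;
}
```

With a negative count such as the -51446 from the trace, this model never re-arms the worker, so the warning would be emitted once rather than on every retry.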
--
Thanks,
Mina