Message-ID: <165410d2-98b5-48d2-9e51-6590d014bfd6@kernel.org>
Date: Mon, 17 Feb 2025 10:38:48 +0100
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Jason Xing <kerneljasonxing@...il.com>,
Mina Almasry <almasrymina@...gle.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, horms@...nel.org, ilias.apalodimas@...aro.org,
netdev@...r.kernel.org
Subject: Re: [PATCH net-next v3] page_pool: avoid infinite loop to schedule delayed worker

On 15/02/2025 00.14, Jason Xing wrote:
> On Sat, Feb 15, 2025 at 4:27 AM Mina Almasry <almasrymina@...gle.com> wrote:
>>
>> On Thu, Feb 13, 2025 at 10:43 PM Jason Xing <kerneljasonxing@...il.com> wrote:
>>>
>>> We noticed that the kworker in page_pool_release_retry() was woken
>>> up repeatedly and indefinitely in production because a buggy
>>> driver drove the inflight count below 0, triggering the warning
>>> in page_pool_inflight()[1].
>>>
>>> Once the inflight value goes negative, we should not expect
>>> the page_pool to return to normal operation.
>>>
>>> This patch mitigates the adverse effect by not rescheduling
>>> the kworker when a negative inflight value is detected in
>>> page_pool_release_retry().
>>>
>>> [1]
>>> [Mon Feb 10 20:36:11 2025] ------------[ cut here ]------------
>>> [Mon Feb 10 20:36:11 2025] Negative(-51446) inflight packet-pages
>>> ...
>>> [Mon Feb 10 20:36:11 2025] Call Trace:
>>> [Mon Feb 10 20:36:11 2025] page_pool_release_retry+0x23/0x70
>>> [Mon Feb 10 20:36:11 2025] process_one_work+0x1b1/0x370
>>> [Mon Feb 10 20:36:11 2025] worker_thread+0x37/0x3a0
>>> [Mon Feb 10 20:36:11 2025] kthread+0x11a/0x140
>>> [Mon Feb 10 20:36:11 2025] ? process_one_work+0x370/0x370
>>> [Mon Feb 10 20:36:11 2025] ? __kthread_cancel_work+0x40/0x40
>>> [Mon Feb 10 20:36:11 2025] ret_from_fork+0x35/0x40
>>> [Mon Feb 10 20:36:11 2025] ---[ end trace ebffe800f33e7e34 ]---
>>>
>>> Note: before this patch, the above call trace would flood
>>> dmesg because the release_dw kworker was rescheduled repeatedly.
>>>
>>> Signed-off-by: Jason Xing <kerneljasonxing@...il.com>
>>
>> Thanks Jason,
>>
>> Reviewed-by: Mina Almasry <almasrymina@...gle.com>
>
> Thanks for the review.
>
>>
>> When you find the root cause of the driver bug, if you can think of
>> ways to catch it sooner in the page_pool or prevent drivers from
>> triggering it, please do consider sending improvements upstream.
>> Thanks!
>
> Sure, it's exactly what I want to do :)

What driver is this happening in?

--Jesper
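
For readers following the thread, the mitigation described in the quoted commit message can be modelled in user-space C roughly as below. This is only a sketch, not the patch itself: `model_pool`, `model_inflight()`, `model_reschedule_before()` and `model_reschedule_after()` are hypothetical names, and the "before" rule assumes the pre-patch check stopped retrying only at exactly zero, which is what the reported symptom suggests.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel symbols. */
struct model_pool {
    int hold_cnt;    /* pages handed out by the pool */
    int release_cnt; /* pages the pool has seen returned */
};

/* Outstanding pages; negative if a buggy driver returns more than it took. */
static int model_inflight(const struct model_pool *pool)
{
    return pool->hold_cnt - pool->release_cnt;
}

/* Assumed pre-patch rule: stop retrying only when inflight is exactly 0. */
static bool model_reschedule_before(int inflight)
{
    return inflight != 0;
}

/* Rule described in the commit message: also stop when inflight is negative. */
static bool model_reschedule_after(int inflight)
{
    return inflight > 0;
}
```

Under the assumed "before" rule, a negative count never reaches zero, so the worker would be rescheduled forever and the warning would repeat each time; under the "after" rule, the worker gives up as soon as the count is zero or negative.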