Message-ID: <CAL+tcoAn3Je1P-c5=tAB9DNPQyYPEknk98WOZpC0jaPMuDqgnA@mail.gmail.com>
Date: Sat, 15 Feb 2025 07:14:15 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Mina Almasry <almasrymina@...gle.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, horms@...nel.org, hawk@...nel.org,
ilias.apalodimas@...aro.org, netdev@...r.kernel.org
Subject: Re: [PATCH net-next v3] page_pool: avoid infinite loop to schedule
delayed worker
On Sat, Feb 15, 2025 at 4:27 AM Mina Almasry <almasrymina@...gle.com> wrote:
>
> On Thu, Feb 13, 2025 at 10:43 PM Jason Xing <kerneljasonxing@...il.com> wrote:
> >
> > We noticed that the kworker in page_pool_release_retry() was woken
> > up repeatedly and indefinitely in production because a buggy
> > driver caused the inflight count to drop below 0, triggering the
> > warning in page_pool_inflight()[1].
> >
> > Once the inflight value goes negative, we should not expect
> > the page_pool to return to normal operation.
> >
> > This patch mitigates the adverse effect by not rescheduling
> > the kworker when it detects a negative inflight value in
> > page_pool_release_retry().
> >
> > [1]
> > [Mon Feb 10 20:36:11 2025] ------------[ cut here ]------------
> > [Mon Feb 10 20:36:11 2025] Negative(-51446) inflight packet-pages
> > ...
> > [Mon Feb 10 20:36:11 2025] Call Trace:
> > [Mon Feb 10 20:36:11 2025] page_pool_release_retry+0x23/0x70
> > [Mon Feb 10 20:36:11 2025] process_one_work+0x1b1/0x370
> > [Mon Feb 10 20:36:11 2025] worker_thread+0x37/0x3a0
> > [Mon Feb 10 20:36:11 2025] kthread+0x11a/0x140
> > [Mon Feb 10 20:36:11 2025] ? process_one_work+0x370/0x370
> > [Mon Feb 10 20:36:11 2025] ? __kthread_cancel_work+0x40/0x40
> > [Mon Feb 10 20:36:11 2025] ret_from_fork+0x35/0x40
> > [Mon Feb 10 20:36:11 2025] ---[ end trace ebffe800f33e7e34 ]---
> > Note: before this patch, the above call trace would flood
> > dmesg due to repeated rescheduling of the release_dw kworker.
> >
> > Signed-off-by: Jason Xing <kerneljasonxing@...il.com>
>
> Thanks Jason,
>
> Reviewed-by: Mina Almasry <almasrymina@...gle.com>
Thanks for the review.
>
> When you find the root cause of the driver bug, if you can think of
> ways to catch it sooner in the page_pool or prevent drivers from
> triggering it, please do consider sending improvements upstream.
> Thanks!
Sure, it's exactly what I want to do :)
Thanks,
Jason
>
> --
> Thanks,
> Mina
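
For readers following the thread, the mitigation described in the quoted
commit message can be sketched as a small userspace model. This is not
the kernel source: the function names, the assumed pre-patch check
(stop only when inflight is exactly 0), and the run cap are all
hypothetical and serve only to illustrate the rescheduling decision.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model, not the kernel source.
 * Each predicate answers: given the current inflight count, should the
 * release worker be scheduled again?
 */

/* Assumed pre-patch behaviour: stop only when inflight is exactly 0. */
static bool should_reschedule_before(int inflight)
{
	return inflight != 0;
}

/* Mitigation as described in the thread: also stop when inflight < 0. */
static bool should_reschedule_after(int inflight)
{
	return inflight > 0;
}

/*
 * Count how many times the worker runs before it stops rescheduling,
 * capped at max_runs. inflight is held constant to model a pool whose
 * accounting never recovers.
 */
static int count_runs(bool (*should_reschedule)(int), int inflight,
		      int max_runs)
{
	int runs = 0;

	while (runs < max_runs) {
		runs++;
		if (!should_reschedule(inflight))
			break;
	}
	return runs;
}
```

With a stuck negative count such as the -51446 from the trace, the first
predicate keeps the worker running until the cap is hit, while the
second lets it run once and stop. A positive count still reschedules in
both cases, so the normal wait-for-pages path is unchanged in this model.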