Message-ID: <ZK3MBfPS-3-tJgjO@slm.duckdns.org>
Date: Tue, 11 Jul 2023 11:39:17 -1000
From: Tejun Heo <tj@...nel.org>
To: Geert Uytterhoeven <geert@...ux-m68k.org>
Cc: Lai Jiangshan <jiangshanlai@...il.com>,
"torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
kernel-team@...a.com, Linux PM list <linux-pm@...r.kernel.org>,
DRI Development <dri-devel@...ts.freedesktop.org>,
linux-rtc@...r.kernel.org,
linux-riscv <linux-riscv@...ts.infradead.org>,
netdev <netdev@...r.kernel.org>,
Linux Fbdev development list <linux-fbdev@...r.kernel.org>,
Linux MMC List <linux-mmc@...r.kernel.org>,
"open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers)" <linux-ide@...r.kernel.org>,
Linux-Renesas <linux-renesas-soc@...r.kernel.org>
Subject: Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2
6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE
mechanism)
Hello,
On Tue, Jul 11, 2023 at 04:06:22PM +0200, Geert Uytterhoeven wrote:
> On Tue, Jul 11, 2023 at 3:55 PM Geert Uytterhoeven <geert@...ux-m68k.org> wrote:
> >
> > Hi Tejun,
> >
> > On Fri, May 12, 2023 at 9:54 PM Tejun Heo <tj@...nel.org> wrote:
> > > Workqueue now automatically marks per-cpu work items that hog CPU for too
> > > long as CPU_INTENSIVE, which excludes them from concurrency management and
> > > prevents stalling other concurrency-managed work items. If a work function
> > > keeps running over the threshold, it likely needs to be switched to use an
> > > unbound workqueue.
> > >
> > > This patch adds a debug mechanism which tracks the work functions which
> > > trigger the automatic CPU_INTENSIVE mechanism and report them using
> > > pr_warn() with exponential backoff.
> > >
> > > v2: Drop bouncing through kthread_worker for printing messages. It was to
> > > avoid introducing circular locking dependency but wasn't effective as it
> > > still had pool lock -> wci_lock -> printk -> pool lock loop. Let's just
> > > print directly using printk_deferred().
> > >
> > > Signed-off-by: Tejun Heo <tj@...nel.org>
> > > Suggested-by: Peter Zijlstra <peterz@...radead.org>
> >
> > Thanks for your patch, which is now commit 6363845005202148
> > ("workqueue: Report work funcs that trigger automatic CPU_INTENSIVE
> > mechanism") in v6.5-rc1.
> >
> > I guess you are interested to know where this triggers.
> > I enabled CONFIG_WQ_CPU_INTENSIVE_REPORT=y, and tested
> > the result on various machines...
>
> > OrangeCrab/Linux-on-LiteX-VexRiscV with ht16k33 14-seg display and ssd130xdrmfb:
> >
> > workqueue: check_lifetime hogged CPU for >10000us 4 times, consider
> > switching to WQ_UNBOUND
> > workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 1024
> > times, consider switching to WQ_UNBOUND
> > workqueue: fb_flashcursor hogged CPU for >10000us 128 times,
> > consider switching to WQ_UNBOUND
> > workqueue: ht16k33_seg14_update hogged CPU for >10000us 128 times,
> > consider switching to WQ_UNBOUND
> > workqueue: mmc_rescan hogged CPU for >10000us 128 times, consider
> > switching to WQ_UNBOUND
>
> Got one more after a while:
>
> workqueue: neigh_managed_work hogged CPU for >10000us 4 times,
> consider switching to WQ_UNBOUND

I wonder whether the right thing to do here is to scale the threshold
according to relative processing power. It's difficult to come up with a
single threshold which works well across both the latest, fastest CPUs and
really tiny ones. I'll think about it some more, but if you have ideas,
please feel free to suggest them.
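
As a rough illustration of what such scaling could look like, here is a
minimal userspace sketch, not kernel code and not something proposed in this
thread. It assumes a BogoMIPS-style figure is available as a crude measure of
processing power. The helper name, the 4000 reference value, and the 1 second
cap are all hypothetical; only the 10000us base comes from the reports quoted
above.

```c
/*
 * Hypothetical constants, chosen only for illustration.
 * BASE_THRESH_US matches the ">10000us" seen in the quoted reports;
 * the reference speed and the cap are assumptions.
 */
#define BASE_THRESH_US	10000UL		/* 10ms default threshold */
#define REF_BOGOMIPS	4000UL		/* assumed "fast enough" reference */
#define MAX_THRESH_US	1000000UL	/* assumed upper bound: 1 second */

/*
 * Return a CPU-hog threshold in microseconds, scaled up in inverse
 * proportion to how far the machine falls below the reference speed.
 * Machines at or above the reference keep the base threshold.
 */
static unsigned long scaled_thresh_us(unsigned long bogomips)
{
	unsigned long thresh;

	if (bogomips == 0)		/* avoid dividing by zero */
		bogomips = 1;

	if (bogomips >= REF_BOGOMIPS)
		return BASE_THRESH_US;

	/* 10000 * 4000 fits comfortably in a 32-bit unsigned long */
	thresh = BASE_THRESH_US * REF_BOGOMIPS / bogomips;

	return thresh < MAX_THRESH_US ? thresh : MAX_THRESH_US;
}
```

With these assumed numbers, a machine rated at half the reference speed would
get a 20ms threshold, and very slow machines would saturate at the cap rather
than growing without bound.
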
Thanks.
--
tejun