Message-ID: <8734rhvlr2.wl-tiwai@suse.de>
Date: Fri, 19 Apr 2024 09:45:37 +0200
From: Takashi Iwai <tiwai@...e.de>
To: Harshit Mogalapalli <harshit.m.mogalapalli@...cle.com>
Cc: Takashi Iwai <tiwai@...e.de>,
Helge Deller <deller@....de>,
Nam Cao <namcao@...utronix.de>,
Thomas Zimmermann <tzimmermann@...e.de>,
Daniel Vetter <daniel@...ll.ch>,
linux-fbdev@...r.kernel.org,
dri-devel@...ts.freedesktop.org,
bigeasy@...utronix.de,
patrik.r.jakobsson@...il.com,
LKML <linux-kernel@...r.kernel.org>,
Vegard Nossum <vegard.nossum@...cle.com>,
George Kennedy <george.kennedy@...cle.com>,
Darren Kenny <darren.kenny@...cle.com>,
chuansheng.liu@...el.com
Subject: Re: [bug-report] task info hung problem in fb_deferred_io_work()
On Fri, 19 Apr 2024 09:39:09 +0200,
Harshit Mogalapalli wrote:
>
> Hi Takashi,
>
> On 19/04/24 12:14, Takashi Iwai wrote:
> > On Thu, 18 Apr 2024 21:29:57 +0200,
> > Helge Deller wrote:
> >>
> >> On 4/18/24 16:26, Takashi Iwai wrote:
> >>> On Thu, 18 Apr 2024 16:06:52 +0200,
> >>> Nam Cao wrote:
> >>>>
> >>>> On 2024-04-18 Harshit Mogalapalli wrote:
> >>>>> While fuzzing 5.15.y kernel with Syzkaller, we noticed a INFO: task hung
> >>>>> bug in fb_deferred_io_work()
> >>>>
> >>>> Which framebuffer device are you using exactly? It is possible that
> >>>> the problem is with the device driver, not core framebuffer.
> >>>
> >>> Note that it was already known that using flush_delayed_work() caused
> >>> a problem. See the thread of the fix patch:
> >>> https://lore.kernel.org/all/20230129082856.22113-1-tiwai@suse.de/
> >>
> >> Harshit reported the hung tasks with kernel v5.15-stable, and can even reproduce
> >> that issue with kernel v6.9-rc4 although it has all of your patches from
> >> that referenced mail thread applied.
> >> So, what does your statement that "it was already known that it causes problems" exactly mean?
> >> Can it be fixed? Is someone looking into fixing it?
> >
> > My original fix was intentionally with cancel_delayed_work_sync()
> > because flush_delayed_work() didn't work. We knew that it'd miss some
> > last-minute queued change, but it's better than crash, so it was
> > applied in that way.
> >
>
> Thanks for sharing these details.
>
> > Then later on, the commit 33cd6ea9c067 changed cancel_*() to
> > flush_delayed_work() blindly, and the known problem resurfaced again.
> >
>
> I have reverted that commit, but still could see some other task hung
> message as shared here on other reply:
>
> https://lore.kernel.org/all/d2485cb9-277d-4b8e-9794-02f1efababc9@oracle.com/

Yes, then it could be a different cause, I suppose.
The problem with flush_delayed_work() was a real crash, not a hung
task, IIRC.

Can you reproduce the issue with the latest upstream from Linus' tree, too?

thanks,

Takashi