Message-ID: <CAMeQTsbEjUyOYDAF-kFwTcovLr+8gHQGa27jPkeeJqmLhwbTag@mail.gmail.com>
Date: Fri, 19 Apr 2024 09:50:17 +0200
From: Patrik Jakobsson <patrik.r.jakobsson@...il.com>
To: Takashi Iwai <tiwai@...e.de>
Cc: Harshit Mogalapalli <harshit.m.mogalapalli@...cle.com>, Helge Deller <deller@....de>, 
	Nam Cao <namcao@...utronix.de>, Thomas Zimmermann <tzimmermann@...e.de>, 
	Daniel Vetter <daniel@...ll.ch>, linux-fbdev@...r.kernel.org, 
	dri-devel@...ts.freedesktop.org, bigeasy@...utronix.de, 
	LKML <linux-kernel@...r.kernel.org>, Vegard Nossum <vegard.nossum@...cle.com>, 
	George Kennedy <george.kennedy@...cle.com>, Darren Kenny <darren.kenny@...cle.com>, 
	chuansheng.liu@...el.com
Subject: Re: [bug-report] task info hung problem in fb_deferred_io_work()

On Fri, Apr 19, 2024 at 9:45 AM Takashi Iwai <tiwai@...e.de> wrote:
>
> On Fri, 19 Apr 2024 09:39:09 +0200,
> Harshit Mogalapalli wrote:
> >
> > Hi Takashi,
> >
> > On 19/04/24 12:14, Takashi Iwai wrote:
> > > On Thu, 18 Apr 2024 21:29:57 +0200,
> > > Helge Deller wrote:
> > >>
> > >> On 4/18/24 16:26, Takashi Iwai wrote:
> > >>> On Thu, 18 Apr 2024 16:06:52 +0200,
> > >>> Nam Cao wrote:
> > >>>>
> > >>>> On 2024-04-18 Harshit Mogalapalli wrote:
> > >>>>> While fuzzing the 5.15.y kernel with Syzkaller, we noticed an INFO: task hung
> > >>>>> bug in fb_deferred_io_work()
> > >>>>
> > >>>> Which framebuffer device are you using exactly? It is possible that
> > >>>> the problem is with the device driver, not core framebuffer.
> > >>>
> > >>> Note that it was already known that using flush_delayed_work() caused
> > >>> a problem.  See the thread of the fix patch:
> > >>>     https://lore.kernel.org/all/20230129082856.22113-1-tiwai@suse.de/
> > >>
> > >> Harshit reported the hung tasks with kernel v5.15-stable, and can even reproduce
> > >> that issue with kernel v6.9-rc4 although it has all of your patches from
> > >> that referenced mail thread applied.
> > >> So, what does your statement that "it was already known that it causes problems" exactly mean?
> > >> Can it be fixed? Is someone looking into fixing it?
> > >
> > > My original fix was intentionally with cancel_delayed_work_sync()
> > > because flush_delayed_work() didn't work.  We knew that it'd miss some
> > > last-minute queued change, but it's better than a crash, so it was
> > > applied in that way.
> > >
> >
> > Thanks for sharing these details.
> >
> > > Then later on, the commit 33cd6ea9c067 changed cancel_*() to
> > > flush_delayed_work() blindly, and the known problem resurfaced.
> > >
> >
> > > I have reverted that commit, but could still see another task hung
> > > message, as shared in another reply:
> >
> > https://lore.kernel.org/all/d2485cb9-277d-4b8e-9794-02f1efababc9@oraclecom/
>
> Yes, then it could be a different cause, I suppose.
> The crash with flush_delayed_work() was a real crash, no hanging task,
> IIRC.

Neither cancel_delayed_work_sync() nor flush_delayed_work() prevents new
work from being scheduled after it returns. But
cancel_delayed_work_sync() at least makes sure the queue is empty, so
the problem becomes less apparent.

Could this explain what we're seeing?
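
To make the requeueing point concrete, here is a rough userspace toy
model. It is not kernel code and not the fbdev implementation: the
toy_* names and the shutting_down guard flag are hypothetical, and the
model ignores the part where cancel_delayed_work_sync() waits for an
already-running handler. It only illustrates that both calls leave
nothing queued at the moment they return, yet a late caller can queue
the work again unless something explicitly blocks it.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace toy model of a single delayed work item (not kernel code). */
struct toy_dwork {
	bool pending;        /* queued, timer not yet fired */
	bool shutting_down;  /* hypothetical guard flag, assumed for illustration */
	int  runs;           /* number of times the handler has executed */
};

/* Models schedule_delayed_work(): queue unless the guard forbids it. */
static bool toy_schedule(struct toy_dwork *w, bool honor_guard)
{
	if (honor_guard && w->shutting_down)
		return false;
	if (w->pending)
		return false;   /* already queued */
	w->pending = true;
	return true;
}

/* Models the timer firing and the handler running once. */
static void toy_run_pending(struct toy_dwork *w)
{
	if (w->pending) {
		w->pending = false;
		w->runs++;
	}
}

/* Models flush_delayed_work(): run any pending work now, then return. */
static void toy_flush(struct toy_dwork *w)
{
	toy_run_pending(w);
}

/* Models cancel_delayed_work_sync(): drop pending work without running it. */
static void toy_cancel_sync(struct toy_dwork *w)
{
	w->pending = false;
}

/*
 * Teardown followed by a late caller that tries to queue the work again.
 * Returns how many times the handler ran after teardown returned.
 */
static int toy_late_runs(bool use_cancel, bool honor_guard)
{
	struct toy_dwork w = { .pending = true };
	int before;

	w.shutting_down = true;
	if (use_cancel)
		toy_cancel_sync(&w);
	else
		toy_flush(&w);
	assert(!w.pending);     /* both variants leave nothing queued here */
	before = w.runs;

	toy_schedule(&w, honor_guard);  /* late queueing attempt */
	toy_run_pending(&w);            /* timer fires after teardown */
	return w.runs - before;
}
```

In this model, toy_late_runs() reports one late handler run for both
the cancel and the flush variant when the guard is ignored, and none
when the scheduling path checks the guard. Whether such a check is the
right fix for the real code is a separate question.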

>
> Can you reproduce the issue with the latest Linus upstream, too?
>
>
> thanks,
>
> Takashi
