linux-kernel - Re: [syzbot] INFO: task hung in jbd2_journal_commit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <YcD0itBAoOPNNKvX@mit.edu>
Date:   Mon, 20 Dec 2021 16:24:26 -0500
From:   "Theodore Ts'o" <tytso@....edu>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Hillf Danton <hdanton@...a.com>,
        syzbot <syzbot+9c3fb12e9128b6e1d7eb@...kaller.appspotmail.com>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] INFO: task hung in jbd2_journal_commit_transaction (3)

On Sun, Dec 19, 2021 at 04:20:57AM +0000, Matthew Wilcox wrote:
> > Hey Willy
> > >
> > >sched_setattr(0x0, &(0x7f0000000080)={0x38, 0x1, 0x0, 0x0, 0x1}, 0x0)
> > >
> > >so you've set a SCHED_FIFO priority and then are surprised that some
> > >tasks are getting starved?
> > 
> > Can you speficy a bit more on how fifo could block journald waiting for
> > IO completion more than 120 seconds?
> 
> Sure!  You can see from the trace below that jbd2/sda1-8 is in D state,
> so we know nobody's called unlock_buffer() yet, which would have woken
> it.  That would happen in journal_end_buffer_io_sync(), which is
> the b_end_io for the buffer.
> 
> Learning more detail than that would require knowing the I/O path
> for this particular test system.  I suspect that the I/O was submitted
> and has even completed, but there's a kernel thread waiting to run which
> will call the ->b_end_io that hasn't been scheduled yet, because it's
> at a lower priority than all the threads which are running at
> SCHED_FIFO.
> 
> I'm disinclined to look at this report much further because syzbot is
> stumbling around trying things which are definitely in the category of
> "if you do this and things break, you get to keep both pieces".  You
> can learn some interesting things by playing with the various RT
> scheduling classes, but mostly what you can learn is that you need to
> choose your priorities carefully to have a functioning system.

In general, real-time threads (anything scheduled with SCHED_FIFO or
SCHED_RT) should never, *ever* try to do any kind of I/O.  After all,
I/O can block, and if a real-time thread blocks, so much for any kind
of real-time guarantee that you might have.

If you must use do I/O from soft real-time thread, one trick you *can*
do is to some number of CPU's which are reserved for real-time
threads, and a subset of threads which are reserved for non-real-time
threads, enforced using CPU pinning.  It's still not prefect, since
there are still priority inheritance issues, and while this protects
against a non-real-time thread holding some lock which is needed by a
real-time (SCHED_FIFO) thread, if there are two SCHED_FIFO running at
different priorities it's still possible to deadlock the entire
kernel.

Can it be done?  Sure; I was part of an effort to make it work for the
US Navy's DDG-1000 Zumwalt-class destroyer[1].  But it's tricky, and
it's why IBM got paid the big bucks. :-)  Certainly it's going to be
problematic for syzkaller if it's just going to be randomly trying to
set some threads to be real-time without doing any kind of formal
planning.

[1] https://dl.acm.org/doi/10.1147/sj.472.0207

Cheers,

						- Ted