linux-ext4 - Re: INFO: rcu detected stall in ext4_file_write

Message-ID: <CACT4Y+Z=7+UZk9cxOaGC9B-=U=YsKj9AWyOYZKBGpZZSPdU9mg@mail.gmail.com>
Date:   Thu, 28 Feb 2019 10:34:29 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     "Theodore Y. Ts'o" <tytso@....edu>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        syzbot <syzbot+7d19c5fe6a3f1161abb7@...kaller.appspotmail.com>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        linux-ext4@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>
Subject: Re: INFO: rcu detected stall in ext4_file_write_iter

On Wed, Feb 27, 2019 at 10:58 PM Theodore Y. Ts'o <tytso@....edu> wrote:
>
> On Wed, Feb 27, 2019 at 10:58:50AM +0100, Dmitry Vyukov wrote:
> > Peter, Ingo, do you have any updates on the
> > perf_event_open/sched_setattr stalls? This bug causes assorted hangs
> > throughout the kernel and so is nasty.
> >
> > syzkaller tries to remove all syscalls from reproducers one-by-one.
> > Somehow without sched_setattr the hang did not reproduce (a bunch of
> > repros have perf_event_open+sched_setattr so somehow they seem to be
> > related)
>
> FWIW, at least for me, the repro.c with sched_setattr commented out
> (see the repro.c attached to a message[1] earlier in the thread)
> reproduced reliably on a 2 CPU, 2 GB memory KVM using the
> ext4.git tree (dev branch, 5.0-rc3 plus ext4 commits for the next
> merge window) using a Debian stable-based VM[2].
>
> [1] https://groups.google.com/d/msg/syzkaller-bugs/ByPpM3WZw1s/li7SsaEyAgAJ
> [2] https://mirrors.edge.kernel.org/pub/linux/kernel/people/tytso/kvm-xfstests/root_fs.img.amd64
>
> > But even with perfect repros machines still won't be
> > able to tell in all cases that even though the hang happened in ext4
> > code, the root cause is actually another scheduler-related system
> > call. So thanks for looking into this.
>
> To be clear, there was *not* a scheduler-related system call in the
> repro.c I was playing with (see [1]); just perf_event_open(2) and
> sendfile(2).

Let me correct the statement then:

But even with perfect repros machines still won't be able to tell in
all cases that even though the hang happened in ext4 code, the root
cause is actually another perf-related system call. So thanks for
looking into this.
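
For readers unfamiliar with the "remove all syscalls one-by-one" step
mentioned above, here is a minimal sketch of that kind of greedy
minimization. It is not syzkaller's actual implementation: the function
name minimize and the reproduces callback are hypothetical, and a real
oracle would boot a VM, run the candidate program, and watch for the
stall instead of inspecting a list of names.

```python
from typing import Callable, List, Sequence


def minimize(calls: Sequence[str],
             reproduces: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop calls one at a time while the bug still reproduces.

    `reproduces` is a hypothetical oracle: it returns True if running the
    given list of calls still triggers the hang.
    """
    current = list(calls)
    i = 0
    while i < len(current):
        # Try the program with call i removed.
        candidate = current[:i] + current[i + 1:]
        if reproduces(candidate):
            # The hang survives without this call, so leave it out.
            current = candidate
        else:
            # The hang disappeared, so this call is kept.
            i += 1
    return current


if __name__ == "__main__":
    # Toy oracle (an assumption for illustration only): the hang needs
    # perf_event_open and sendfile, and nothing else.
    def toy_oracle(prog: List[str]) -> bool:
        return "perf_event_open" in prog and "sendfile" in prog

    print(minimize(["perf_event_open", "sched_setattr", "sendfile"],
                   toy_oracle))
```

The result depends entirely on the oracle. If a hang is timing-dependent
and fails to show up on the one run where a call was removed, that call
stays in the reproducer even though it is not the root cause, which would
be consistent with sched_setattr surviving minimization on one setup and
being unnecessary on another.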