Message-ID: <CACT4Y+Za=DuxCYEqG=AMsrcZ5v=B1dqsCD4bMGq03F9LKGdS0g@mail.gmail.com>
Date:   Mon, 4 Nov 2019 09:04:05 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Song Liu <songliubraving@...com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        syzbot <syzbot+efb9e48b9fbdc49bb34a@...kaller.appspotmail.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "amir73il@...il.com" <amir73il@...il.com>,
        "darrick.wong@...cle.com" <darrick.wong@...cle.com>,
        "hughd@...gle.com" <hughd@...gle.com>,
        "jack@...e.cz" <jack@...e.cz>,
        "jglisse@...hat.com" <jglisse@...hat.com>,
        Josef Bacik <josef@...icpanda.com>,
        "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "sfr@...b.auug.org.au" <sfr@...b.auug.org.au>,
        "syzkaller-bugs@...glegroups.com" <syzkaller-bugs@...glegroups.com>,
        "william.kucharski@...cle.com" <william.kucharski@...cle.com>,
        "willy@...radead.org" <willy@...radead.org>
Subject: Re: INFO: task hung in mpage_prepare_extent_to_map

On Mon, Oct 28, 2019 at 11:16 PM Song Liu <songliubraving@...com> wrote:
>
>
>
> > On Oct 28, 2019, at 1:14 PM, Johannes Weiner <hannes@...xchg.org> wrote:
> >
> > On Mon, Oct 28, 2019 at 12:52:09PM -0700, syzbot wrote:
> >> Hello,
> >>
> >> syzbot found the following crash on:
> >>
> >> HEAD commit:    12d61c69 Add linux-next specific files for 20191024
> >> git tree:       linux-next
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=15a0fa97600000
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=afb75fd8c9fd5ed8
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=efb9e48b9fbdc49bb34a
> >> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13a63dc4e00000
> >>
> >> The bug was bisected to:
> >>
> >> commit 9c61acffe2b8833152041f7b6a02d1d0a17fd378
> >> Author: Song Liu <songliubraving@...com>
> >> Date:   Wed Oct 23 00:24:28 2019 +0000
> >>
> >>    mm,thp: recheck each page before collapsing file THP
> >>
> >> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13eb6ec0e00000
> >> final crash:    https://syzkaller.appspot.com/x/report.txt?x=101b6ec0e00000
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=17eb6ec0e00000
> >>
> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> Reported-by: syzbot+efb9e48b9fbdc49bb34a@...kaller.appspotmail.com
> >> Fixes: 9c61acffe2b8 ("mm,thp: recheck each page before collapsing file THP")
> >>
> >> INFO: task khugepaged:1084 blocked for more than 143 seconds.
> >>      Not tainted 5.4.0-rc4-next-20191024 #0
> >> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> khugepaged      D27568  1084      2 0x80004000
> >> Call Trace:
> >> context_switch kernel/sched/core.c:3384 [inline]
> >> __schedule+0x94a/0x1e70 kernel/sched/core.c:4069
> >> schedule+0xd9/0x260 kernel/sched/core.c:4136
> >> io_schedule+0x1c/0x70 kernel/sched/core.c:5780
> >> wait_on_page_bit_common mm/filemap.c:1175 [inline]
> >> __lock_page+0x422/0xab0 mm/filemap.c:1383
> >> lock_page include/linux/pagemap.h:480 [inline]
> >> mpage_prepare_extent_to_map+0xb3f/0xf90 fs/ext4/inode.c:2668
> >> ext4_writepages+0xb6a/0x2e70 fs/ext4/inode.c:2866
> >> ? 0xffffffff81000000
> >> do_writepages+0xfa/0x2a0 mm/page-writeback.c:2344
> >> __filemap_fdatawrite_range+0x2bc/0x3b0 mm/filemap.c:421
> >> __filemap_fdatawrite mm/filemap.c:429 [inline]
> >> filemap_flush+0x24/0x30 mm/filemap.c:456
> >
> > This is a double-locking deadlock. The page lock is already held when
> > we call into filemap_flush() here, which then does another lock_page()
> > in write_cache_pages().
> >
> > To fix it, we have to either initiate flushing before acquiring the
> > page lock, or simply skip over dirty pages.
> >
> > Maybe doing vfs_fsync_range() from the madvise(HUGEPAGE) call isn't a
> > bad idea after all? (I had discussed this with Song off-list before.)
>
> Thanks syzbot and Johannes!
>
> I just sent a quick fix that simply removes filemap_flush().
>
> I will work on a better mechanism to flush the file.

Is this expected to reach linux-next soon?
It's still not there, and over the past few days this crash has happened
17K+ times, effectively stalling linux-next testing:
https://syzkaller.appspot.com/bug?id=4a3b0ba28ec7d0277338be02e1331068504dc228
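
For readers following the thread, the double-locking deadlock Johannes describes, and the two fixes he suggests, can be sketched with a small userspace toy model. This is only an illustration under stated assumptions, not kernel code: `Page`, `writeback`, and the three `collapse_*` functions are hypothetical names invented here, a plain `threading.Lock` stands in for the page lock, and a short timeout stands in for the indefinite sleep that `lock_page()` would enter in the kernel.

```python
import threading


class Page:
    """Toy stand-in for a page-cache page (hypothetical, not a kernel type)."""

    def __init__(self, index, dirty=False):
        self.index = index
        self.dirty = dirty
        # Non-recursive lock: a second acquire by the same holder blocks.
        self.lock = threading.Lock()


def writeback(pages, timeout=0.1):
    """Model of the flush path: lock each dirty page, clean it, unlock it.

    Returns the indices of pages whose lock could not be taken within
    `timeout`. In the kernel there is no timeout; the task would hang.
    """
    stuck = []
    for page in pages:
        if not page.dirty:
            continue
        if not page.lock.acquire(timeout=timeout):
            stuck.append(page.index)
            continue
        try:
            page.dirty = False
        finally:
            page.lock.release()
    return stuck


def collapse_flush_under_lock(pages, target):
    """Problematic ordering: flush while already holding the page lock."""
    with target.lock:
        return writeback(pages)


def collapse_flush_first(pages, target):
    """Suggested fix 1: initiate flushing before acquiring the page lock."""
    stuck = writeback(pages)
    with target.lock:
        pass  # the collapse work would go here
    return stuck


def collapse_skip_dirty(pages, target):
    """Suggested fix 2: never flush under the lock; skip over dirty pages."""
    with target.lock:
        if target.dirty:
            return False  # skipped; a later pass can retry
        return True


if __name__ == "__main__":
    pages = [Page(0, dirty=True), Page(1, dirty=True)]
    print("stuck when flushing under lock:",
          collapse_flush_under_lock(pages, pages[0]))

    pages = [Page(0, dirty=True), Page(1, dirty=True)]
    print("stuck when flushing first:",
          collapse_flush_first(pages, pages[0]))
```

In the first ordering the flush path tries to take the lock its own caller already holds, which is the self-deadlock reported above; the other two orderings avoid ever requesting a held lock from the same context.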
