linux-kernel - Re: linux-next test error

Message-ID: <20180906081253.GB19319@quack2.suse.cz>
Date:   Thu, 6 Sep 2018 10:12:53 +0200
From:   Jan Kara <jack@...e.cz>
To:     Souptick Joarder <jrdr.linux@...il.com>
Cc:     Jan Kara <jack@...e.cz>,
        syzbot+87a05ae4accd500f5242@...kaller.appspotmail.com,
        ak@...ux.intel.com, Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, Linux-MM <linux-mm@...ck.org>,
        mawilcox@...rosoft.com, mgorman@...hsingularity.net,
        syzkaller-bugs@...glegroups.com, tim.c.chen@...ux.intel.com,
        zwisler@...nel.org
Subject: Re: linux-next test error

On Thu 06-09-18 00:37:06, Souptick Joarder wrote:
> On Wed, Sep 5, 2018 at 2:25 PM Jan Kara <jack@...e.cz> wrote:
> >
> > On Wed 05-09-18 00:13:02, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit:    387ac6229ecf Add linux-next specific files for 20180905
> > > git tree:       linux-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=149c67a6400000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=ad5163873ecfbc32
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=87a05ae4accd500f5242
> > > compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+87a05ae4accd500f5242@...kaller.appspotmail.com
> > >
> > > > INFO: task hung in do_page_mkwrite
> > > > INFO: task syz-fuzzer:4876 blocked for more than 140 seconds.
> > >       Not tainted 4.19.0-rc2-next-20180905+ #56
> > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > syz-fuzzer      D21704  4876   4871 0x00000000
> > > Call Trace:
> > >  context_switch kernel/sched/core.c:2825 [inline]
> > >  __schedule+0x87c/0x1df0 kernel/sched/core.c:3473
> > >  schedule+0xfb/0x450 kernel/sched/core.c:3517
> > >  io_schedule+0x1c/0x70 kernel/sched/core.c:5140
> > >  wait_on_page_bit_common mm/filemap.c:1100 [inline]
> > >  __lock_page+0x5b7/0x7a0 mm/filemap.c:1273
> > >  lock_page include/linux/pagemap.h:483 [inline]
> > >  do_page_mkwrite+0x429/0x520 mm/memory.c:2391
> >
> > Waiting for page lock after ->page_mkwrite callback. Which means
> > ->page_mkwrite did not return VM_FAULT_LOCKED but 0. Looking into
> > linux-next... indeed "fs: convert return type int to vm_fault_t" has busted
> > block_page_mkwrite(). It has to return VM_FAULT_LOCKED and not 0 now.
> > Souptick, can I ask you to run 'fstests' for at least common filesystems
> > like ext4, xfs, btrfs when you change generic filesystem code please? That
> > would catch a bug like this immediately. Thanks.
> 
> Looking at the existing code, block_page_mkwrite() returns 0, not
> VM_FAULT_LOCKED, on the success path, and this patch doesn't change any
> existing behaviour of block_page_mkwrite() except for adding one new input
> parameter to return the err value to the caller.

Yeah, you are right and this confused me. In your version
block_page_mkwrite() returns block_page_mkwrite_return(err1) in case of
error but 0 in case of success and the caller - ext4_page_mkwrite() - then
uses block_page_mkwrite_return() again if block_page_mkwrite() returned 0.
So I agree the code path I pointed out won't result in returning 0 instead
of VM_FAULT_LOCKED but the calling convention is really very confusing.
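
To make the double translation concrete, here is a minimal userspace sketch
of the convention described above. It is a model under stated assumptions,
not the kernel code: the VM_FAULT_* bit values are placeholders, the
model_*() functions are hypothetical stand-ins, and the errno mapping is a
simplified version of what block_page_mkwrite_return() does.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder bit values for this model only; the kernel defines its own. */
typedef unsigned int vm_fault_t;
#define VM_FAULT_OOM     0x1u
#define VM_FAULT_SIGBUS  0x2u
#define VM_FAULT_NOPAGE  0x4u
#define VM_FAULT_LOCKED  0x8u

/* Simplified model of the helper of the same name: errno -> fault code. */
static vm_fault_t block_page_mkwrite_return(int err)
{
	if (err == 0)
		return VM_FAULT_LOCKED;	/* success: page is handed back locked */
	if (err == -EFAULT)
		return VM_FAULT_NOPAGE;
	if (err == -ENOMEM)
		return VM_FAULT_OOM;
	return VM_FAULT_SIGBUS;		/* any other error, e.g. -ENOSPC, -EIO */
}

/*
 * Hypothetical stand-in for the patched block_page_mkwrite(): it returns a
 * translated fault code on error, but plain 0 on success.
 */
static vm_fault_t model_block_page_mkwrite(int simulated_err, int *err)
{
	*err = simulated_err;
	if (simulated_err)
		return block_page_mkwrite_return(simulated_err);
	return 0;
}

/*
 * Hypothetical stand-in for the caller: 0 from the helper means "success",
 * so the caller translates a second time to obtain VM_FAULT_LOCKED.
 */
static vm_fault_t model_ext4_page_mkwrite(int simulated_err)
{
	int err = 0;
	vm_fault_t ret = model_block_page_mkwrite(simulated_err, &err);

	if (!ret)
		ret = block_page_mkwrite_return(err);
	return ret;
}
```

In this model, model_ext4_page_mkwrite(0) yields VM_FAULT_LOCKED only because
of the second call to block_page_mkwrite_return(); a caller that passed the
helper's 0 straight through would report success without VM_FAULT_LOCKED.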

> -int ext4_page_mkwrite(struct vm_fault *vmf)
> +vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
> 
> +       err = 0;
> +       ret = block_page_mkwrite(vma, vmf, get_block, &err);
>         if (!ret && ext4_should_journal_data(inode)) {
>                 if (ext4_walk_page_buffers(handle, page_buffers(page), 0,
>                           PAGE_SIZE, NULL, do_journal_get_write_access)) {
>                         unlock_page(page);
> -                       ret = VM_FAULT_SIGBUS;
> 
> I think this part has created the problem where page_mkwrite()
> ends up returning 0.

So this branch is definitely wrong but I somewhat doubt it's the one we've
taken - this can happen only in case of IO error.
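
For reference, the caller-side behaviour behind the hung-task trace can be
sketched as a small userspace model. Everything here is an assumption made
for illustration: the page lock is reduced to a boolean, the model_*() and
mkwrite_*() names are hypothetical, and "would block" stands in for a real
lock_page() sleeping.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int vm_fault_t;
#define VM_FAULT_LOCKED 0x8u	/* placeholder value for this model */

struct model_page { bool locked; };

enum model_outcome { MODEL_OK, MODEL_WOULD_BLOCK };

/* Hypothetical callback type: may lock the page, returns a fault code. */
typedef vm_fault_t (*model_mkwrite_fn)(struct model_page *page);

/* Follows the convention: lock held and VM_FAULT_LOCKED returned. */
static vm_fault_t mkwrite_ok(struct model_page *page)
{
	page->locked = true;
	return VM_FAULT_LOCKED;
}

/* Keeps the page locked but reports 0. */
static vm_fault_t mkwrite_locked_returns_zero(struct model_page *page)
{
	page->locked = true;
	return 0;
}

/* Hits an error, unlocks the page, and still reports 0. */
static vm_fault_t mkwrite_unlocked_returns_zero(struct model_page *page)
{
	page->locked = false;
	return 0;
}

/* Caller side: without VM_FAULT_LOCKED it takes the page lock itself. */
static enum model_outcome model_do_page_mkwrite(model_mkwrite_fn fn,
						struct model_page *page)
{
	vm_fault_t ret = fn(page);

	if (!(ret & VM_FAULT_LOCKED)) {
		if (page->locked)
			return MODEL_WOULD_BLOCK; /* a real lock_page() sleeps */
		page->locked = true;
	}
	return MODEL_OK;
}
```

In this model only the callback that returns 0 while still holding the lock
makes the caller block. The callback that unlocks and returns 0 does not
block, but the caller then proceeds as if nothing had failed.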

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR