linux-kernel - Re: bio linked list corruption.

Message-ID: <CAOMGZ=Eb8mw6CfGaruM-tRir7pWn78vLTaCqp1pGPqkagfzPPA@mail.gmail.com>
Date:   Mon, 5 Dec 2016 20:11:16 +0100
From:   Vegard Nossum <vegard.nossum@...il.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Dave Jones <davej@...emonkey.org.uk>, Chris Mason <clm@...com>,
        Jens Axboe <axboe@...com>,
        Andy Lutomirski <luto@...capital.net>,
        Andy Lutomirski <luto@...nel.org>,
        Al Viro <viro@...iv.linux.org.uk>, Josef Bacik <jbacik@...com>,
        David Sterba <dsterba@...e.com>,
        linux-btrfs <linux-btrfs@...r.kernel.org>,
        Linux Kernel <linux-kernel@...r.kernel.org>,
        Dave Chinner <david@...morbit.com>
Subject: Re: bio linked list corruption.

On 5 December 2016 at 18:55, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Mon, Dec 5, 2016 at 9:09 AM, Vegard Nossum <vegard.nossum@...il.com> wrote:
>>
>> The warning shows that it made it past the list_empty_careful() check
>> in finish_wait() but then bugs out on the &wait->task_list
>> dereference.
>>
>> Anything stick out?
>
> I hate that shmem waitqueue garbage. It's really subtle.
>
> I think the problem is that "wake_up_all()" in shmem_fallocate()
> doesn't necessarily wake up everything. It wakes up TASK_NORMAL -
> which does include TASK_UNINTERRUPTIBLE, but doesn't actually mean
> "everything on the list".
>
> I think that what happens is that the waiters somehow move from
> TASK_UNINTERRUPTIBLE to TASK_RUNNING early, and this means that
> wake_up_all() will ignore them, leave them on the list, and now that
> list on stack is no longer empty at the end.
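
For illustration, here is a minimal userspace sketch of the failure mode described above. It assumes a wake function that unlinks an entry only when the task's state matches the wake mask. Every name in it (struct waiter, wake_all, the ST_* constants) is a hypothetical stand-in, not the kernel's waitqueue API.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; these names are hypothetical, not kernel API. */
enum { ST_RUNNING = 0, ST_INTERRUPTIBLE = 1, ST_UNINTERRUPTIBLE = 2 };
#define ST_NORMAL (ST_INTERRUPTIBLE | ST_UNINTERRUPTIBLE)

struct node { struct node *prev, *next; };
struct waiter { struct node link; int state; };

static void node_init(struct node *n) { n->prev = n->next = n; }

static void node_add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void node_del_init(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    node_init(n);
}

static int node_count(const struct node *head)
{
    int n = 0;
    for (const struct node *p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Wake, and unlink, only the waiters whose state matches the mask. */
static int wake_all(struct node *head, int mask)
{
    int woken = 0;
    struct node *cur = head->next;

    while (cur != head) {
        struct node *next = cur->next;
        struct waiter *w =
            (struct waiter *)((char *)cur - offsetof(struct waiter, link));

        if (w->state & mask) {
            w->state = ST_RUNNING;
            node_del_init(cur);
            woken++;
        }
        cur = next;
    }
    return woken;
}

/* Returns how many waiters are still queued after wake_all(). */
int leftover_after_wake_all(void)
{
    struct node head;                 /* plays the on-stack waitqueue head */
    struct waiter a = { .state = ST_UNINTERRUPTIBLE };
    struct waiter b = { .state = ST_UNINTERRUPTIBLE };

    node_init(&head);
    node_add_tail(&a.link, &head);
    node_add_tail(&b.link, &head);

    /* Something else (another waitqueue, a directed wakeup) wakes b early. */
    b.state = ST_RUNNING;

    int woken = wake_all(&head, ST_NORMAL);
    assert(woken == 1);               /* only a matched the mask */
    return node_count(&head);         /* b is still linked to the head */
}
```

Calling leftover_after_wake_all() reports one waiter still linked to the head, which is the state the on-stack head would be left in when the function returns.
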
>
> And the way *THAT* can happen is that the task is on some *other*
> waitqueue as well, and that other waitqueue wakes it up. That's not
> impossible, you can certainly have people on wait-queues that still
> take faults.
>
> Or somebody just uses a directed wake_up_process() or something.
>
> Since you apparently can recreate this fairly easily, how about trying
> this stupid patch?
>
> NOTE! This is entirely untested. I may have screwed this up entirely.
> You get the idea, though - just remove the wait queue head from the
> list - the list entries stay around, but nothing points to the stack
> entry (that we're going to free) any more.
>
> And add the warning to see if this actually ever triggers (and because
> I'd like to see the callchain when it does, to see if it's another
> waitqueue somewhere or what..)
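
As a rough sketch of the idea in the quoted paragraph (unlink the head, leave the entries linked to each other), the following userspace code uses a plain circular doubly linked list. It is not the patch itself, and the helper names (node_unlink, ring_references, head_unreachable_after_unlink) are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }

static void node_add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink n from its ring: the neighbours are joined, n is left untouched. */
static void node_unlink(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* True if any node on the ring containing start points at target. */
static bool ring_references(const struct node *start, const struct node *target)
{
    const struct node *p = start;

    do {
        if (p->next == target || p->prev == target)
            return true;
        p = p->next;
    } while (p != start);
    return false;
}

/* Returns true if, after unlinking the head, no entry points at it. */
bool head_unreachable_after_unlink(void)
{
    struct node head, a, b;           /* head plays the on-stack list head */

    node_init(&head);
    node_add_tail(&a, &head);
    node_add_tail(&b, &head);
    assert(ring_references(&a, &head));   /* entries point at the head */

    node_unlink(&head);

    /* The entries now form a ring of their own: a <-> b. */
    return !ring_references(&a, &head) && a.next == &b && b.next == &a;
}
```

After node_unlink(&head) the two entries only reference each other, so a later removal of either entry would not write into the memory where the head used to live.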

------------[ cut here ]------------
WARNING: CPU: 22 PID: 14012 at mm/shmem.c:2668 shmem_fallocate+0x9a7/0xac0
Kernel panic - not syncing: panic_on_warn set ...

CPU: 22 PID: 14012 Comm: trinity-c73 Not tainted 4.9.0-rc7+ #220
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
ffff8801e32af970 ffffffff81fb08c1 ffffffff83e74b60 ffff8801e32afa48
ffffffff83ed7600 ffffffff847103e0 ffff8801e32afa38 ffffffff81515244
0000000041b58ab3 ffffffff844e21da ffffffff81515061 ffffffff8151591e
Call Trace:
[<ffffffff81fb08c1>] dump_stack+0x83/0xb2
[<ffffffff81515244>] panic+0x1e3/0x3ad
[<ffffffff812708bf>] __warn+0x1bf/0x1e0
[<ffffffff81270aac>] warn_slowpath_null+0x2c/0x40
[<ffffffff8157aef7>] shmem_fallocate+0x9a7/0xac0
[<ffffffff8167c6c0>] vfs_fallocate+0x350/0x620
[<ffffffff815ee5c2>] SyS_madvise+0x432/0x1290
[<ffffffff8100524f>] do_syscall_64+0x1af/0x4d0
[<ffffffff83c965b4>] entry_SYSCALL64_slow_path+0x25/0x25
------------[ cut here ]------------

Attached a full log.


Vegard

Download attachment "0.txt.gz" of type "application/x-gzip" (18919 bytes)