
Message-ID: <CACT4Y+Y-WW-giKkihkMXkKxQ2mK7Lhc60fCta3TqssiWGM8-2A@mail.gmail.com>
Date:   Mon, 31 Dec 2018 07:31:15 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Qian Cai <cai@....pw>
Cc:     syzbot <syzbot+ec1b7575afef85a0e5ca@...kaller.appspotmail.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>, guro@...com,
        Johannes Weiner <hannes@...xchg.org>,
        Josef Bacik <jbacik@...com>,
        Kirill Tkhai <ktkhai@...tuozzo.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Michal Hocko <mhocko@...e.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Matthew Wilcox <willy@...radead.org>
Subject: Re: kernel panic: corrupted stack end in wb_workfn

On Mon, Dec 31, 2018 at 4:47 AM Qian Cai <cai@....pw> wrote:
>
> Ah, it has KASAN_EXTRA. Need this patch then.
>
> https://lore.kernel.org/lkml/20181228020639.80425-1-cai@lca.pw/
>
> or use GCC from HEAD, which is supposed to cut the stack size in half.
>
> shrink_page_list
> shrink_inactive_list
>
> Those things are 7k each, so 32k would be soon gone.

I am not sure it's just KASAN. I reproduced a stack overflow on this
same call stack without KASAN:
https://groups.google.com/forum/#!msg/syzkaller-bugs/ZaBzAJbn6i8/Py9FVlAqDQAJ
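
For scale, the quoted estimate can be checked with a few lines of
arithmetic. This is only a sketch: the 7k figures are the rough
per-function numbers quoted above, 32k is the stack size quoted above
for the KASAN build, and the helper name remaining_stack is made up for
illustration. It says nothing about the frame sizes in a non-KASAN
build.

```python
KIB = 1024

def remaining_stack(thread_size, frame_sizes):
    """Return the bytes left on a stack of thread_size after the given frames."""
    return thread_size - sum(frame_sizes.values())

# Rough per-function stack usage quoted in this thread (KASAN_EXTRA build).
quoted_frames = {
    "shrink_page_list": 7 * KIB,
    "shrink_inactive_list": 7 * KIB,
}

if __name__ == "__main__":
    left = remaining_stack(32 * KIB, quoted_frames)
    used = 32 * KIB - left
    print(f"used by two reclaim frames: {used // KIB} KiB")
    print(f"left for everything else:   {left // KIB} KiB")
```

Under these numbers, two reclaim functions alone take 14 KiB, which
leaves 18 KiB for the rest of the writeback, ext4, and page allocator
frames in the trace.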

Note: this was originally reported 5 months ago:
https://groups.google.com/forum/#!msg/syzkaller-bugs/C7d0Hm6YcDM/nQeciKgtCgAJ
so it is now in at least 2 releases and causes a stream of induced crashes
that people have spent time debugging:
https://groups.google.com/forum/#!msg/syzkaller-bugs/ZaBzAJbn6i8/Py9FVlAqDQAJ
https://groups.google.com/forum/#!msg/syzkaller-bugs/GIpnqHiIEQg/5jzwQqqfCwAJ
https://syzkaller.appspot.com/bug?id=26c906d472ea470c2cb58c77f08f964f347cbc68
https://groups.google.com/forum/#!msg/syzkaller-bugs/Ovkbsq5qd84/FHsTYlsfDAAJ
and most likely more of these:
https://syzkaller.appspot.com#upstream
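
For readers unfamiliar with the "corrupted stack end detected inside
scheduler" message in the report below: the kernel keeps a magic word
(STACK_END_MAGIC) at the lowest address of each task stack, and
schedule_debug() panics if that word has been overwritten. The
following is a toy user-space model of that idea, not kernel code. The
class ToyStack and its methods are hypothetical, the 7 KiB frame size
is the rough figure quoted in this thread, and a real overflow would
keep writing below the stack rather than stop at offset 0.

```python
import struct

STACK_END_MAGIC = 0x57AC6E9D  # value defined in include/uapi/linux/magic.h

class ToyStack:
    """Toy model: a downward-growing stack with a canary word at its lowest address."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.sp = size  # stack pointer starts at the top and moves down
        # Store the canary as one 8-byte little-endian word at the stack end.
        struct.pack_into("<Q", self.mem, 0, STACK_END_MAGIC)

    def push_frame(self, nbytes):
        """Reserve a frame and fill it, as a function using all its locals would."""
        # Clamped at 0 only so the model stays inside its own buffer.
        new_sp = max(self.sp - nbytes, 0)
        self.mem[new_sp:self.sp] = b"\xaa" * (self.sp - new_sp)
        self.sp = new_sp

    def end_corrupted(self):
        """Return True if the canary word no longer holds the magic value."""
        (word,) = struct.unpack_from("<Q", self.mem, 0)
        return word != STACK_END_MAGIC

if __name__ == "__main__":
    stack = ToyStack(32 * 1024)
    for depth in range(1, 6):
        stack.push_frame(7 * 1024)
        print(depth, "frames, corrupted:", stack.end_corrupted())
```

The check is only evaluated at specific points (in the kernel's case,
on entry to the scheduler), so the overflow itself can happen well
before the panic is reported, which is one reason such crashes are hard
to attribute.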



> On 12/30/18 10:41 PM, syzbot wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    195303136f19 Merge tag 'kconfig-v4.21-2' of git://git.kern..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=176c0ebf400000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=5e7dc790609552d7
> > dashboard link: https://syzkaller.appspot.com/bug?extid=ec1b7575afef85a0e5ca
> > compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16a9a84b400000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17199bb3400000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+ec1b7575afef85a0e5ca@...kaller.appspotmail.com
> >
> > Kernel panic - not syncing: corrupted stack end detected inside scheduler
> > CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 4.20.0+ #396
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
> > 01/01/2011
> > Workqueue: writeback wb_workfn (flush-8:0)
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x1d3/0x2c6 lib/dump_stack.c:113
> >  panic+0x2ad/0x55f kernel/panic.c:189
> >  schedule_debug kernel/sched/core.c:3285 [inline]
> >  __schedule+0x1ec6/0x1ed0 kernel/sched/core.c:3394
> >  preempt_schedule_common+0x1f/0xe0 kernel/sched/core.c:3596
> >  preempt_schedule+0x4d/0x60 kernel/sched/core.c:3622
> >  ___preempt_schedule+0x16/0x18
> >  __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:161 [inline]
> >  _raw_spin_unlock_irqrestore+0xbb/0xd0 kernel/locking/spinlock.c:184
> >  spin_unlock_irqrestore include/linux/spinlock.h:384 [inline]
> >  __remove_mapping+0x932/0x1af0 mm/vmscan.c:967
> >  shrink_page_list+0x6610/0xc2e0 mm/vmscan.c:1461
> >  shrink_inactive_list+0x77b/0x1c60 mm/vmscan.c:1961
> >  shrink_list mm/vmscan.c:2273 [inline]
> >  shrink_node_memcg+0x7a8/0x19a0 mm/vmscan.c:2538
> >  shrink_node+0x3e1/0x17f0 mm/vmscan.c:2753
> >  shrink_zones mm/vmscan.c:2987 [inline]
> >  do_try_to_free_pages+0x3df/0x12a0 mm/vmscan.c:3049
> >  try_to_free_pages+0x4d0/0xb90 mm/vmscan.c:3265
> >  __perform_reclaim mm/page_alloc.c:3920 [inline]
> >  __alloc_pages_direct_reclaim mm/page_alloc.c:3942 [inline]
> >  __alloc_pages_slowpath+0xa5a/0x2db0 mm/page_alloc.c:4335
> >  __alloc_pages_nodemask+0xa89/0xde0 mm/page_alloc.c:4549
> >  alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2106
> >  alloc_pages include/linux/gfp.h:509 [inline]
> >  __page_cache_alloc+0x38c/0x5b0 mm/filemap.c:924
> >  pagecache_get_page+0x396/0xf00 mm/filemap.c:1615
> >  find_or_create_page include/linux/pagemap.h:322 [inline]
> >  ext4_mb_load_buddy_gfp+0xddf/0x1e70 fs/ext4/mballoc.c:1158
> >  ext4_mb_load_buddy fs/ext4/mballoc.c:1241 [inline]
> >  ext4_mb_regular_allocator+0x634/0x1590 fs/ext4/mballoc.c:2190
> >  ext4_mb_new_blocks+0x1de3/0x4840 fs/ext4/mballoc.c:4538
> >  ext4_ext_map_blocks+0x2eef/0x6180 fs/ext4/extents.c:4404
> >  ext4_map_blocks+0x8f7/0x1b60 fs/ext4/inode.c:636
> >  mpage_map_one_extent fs/ext4/inode.c:2480 [inline]
> >  mpage_map_and_submit_extent fs/ext4/inode.c:2533 [inline]
> >  ext4_writepages+0x2564/0x4170 fs/ext4/inode.c:2884
> >  do_writepages+0x9a/0x1a0 mm/page-writeback.c:2335
> >  __writeback_single_inode+0x20a/0x1660 fs/fs-writeback.c:1316
> >  writeback_sb_inodes+0x71f/0x1210 fs/fs-writeback.c:1580
> >  __writeback_inodes_wb+0x1b9/0x340 fs/fs-writeback.c:1649
> >  wb_writeback+0xa73/0xfc0 fs/fs-writeback.c:1758
> > oom_reaper: reaped process 7963 (syz-executor189), now anon-rss:0kB,
> > file-rss:0kB, shmem-rss:0kB
> > rsyslogd invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0,
> > oom_score_adj=0
> >  wb_check_start_all fs/fs-writeback.c:1882 [inline]
> >  wb_do_writeback fs/fs-writeback.c:1908 [inline]
> >  wb_workfn+0xee9/0x1790 fs/fs-writeback.c:1942
> >  process_one_work+0xc90/0x1c40 kernel/workqueue.c:2153
> >  worker_thread+0x17f/0x1390 kernel/workqueue.c:2296
> >  kthread+0x35a/0x440 kernel/kthread.c:246
> >  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
> > CPU: 1 PID: 7840 Comm: rsyslogd Not tainted 4.20.0+ #396
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
> > 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x1d3/0x2c6 lib/dump_stack.c:113
> >  dump_header+0x253/0x1239 mm/oom_kill.c:451
> >  oom_kill_process.cold.27+0x10/0x903 mm/oom_kill.c:966
> >  out_of_memory+0x8ba/0x1480 mm/oom_kill.c:1133
> >  __alloc_pages_may_oom mm/page_alloc.c:3666 [inline]
> >  __alloc_pages_slowpath+0x230c/0x2db0 mm/page_alloc.c:4379
> >  __alloc_pages_nodemask+0xa89/0xde0 mm/page_alloc.c:4549
> >  alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2106
> >  alloc_pages include/linux/gfp.h:509 [inline]
> >  __page_cache_alloc+0x38c/0x5b0 mm/filemap.c:924
> >  page_cache_read mm/filemap.c:2373 [inline]
> >  filemap_fault+0x1595/0x25f0 mm/filemap.c:2557
> >  ext4_filemap_fault+0x82/0xad fs/ext4/inode.c:6317
> >  __do_fault+0x100/0x6b0 mm/memory.c:2997
> >  do_read_fault mm/memory.c:3409 [inline]
> >  do_fault mm/memory.c:3535 [inline]
> >  handle_pte_fault mm/memory.c:3766 [inline]
> >  __handle_mm_fault+0x392f/0x5630 mm/memory.c:3890
> >  handle_mm_fault+0x54f/0xc70 mm/memory.c:3927
> >  do_user_addr_fault arch/x86/mm/fault.c:1475 [inline]
> >  __do_page_fault+0x5f6/0xd70 arch/x86/mm/fault.c:1541
> >  do_page_fault+0xf2/0x7e0 arch/x86/mm/fault.c:1572
> >  page_fault+0x1e/0x30 arch/x86/entry/entry_64.S:1143
> > RIP: 0033:0x7f00f990e1fd
> > Code: Bad RIP value.
> > RSP: 002b:00007f00f6eade30 EFLAGS: 00010293
> > RAX: 0000000000000fd2 RBX: 000000000111f170 RCX: 00007f00f990e1fd
> > RDX: 0000000000000fff RSI: 00007f00f86e25a0 RDI: 0000000000000004
> > RBP: 0000000000000000 R08: 000000000110a260 R09: 0000000000000000
> > R10: 74616c7567657227 R11: 0000000000000293 R12: 000000000065e420
> > R13: 00007f00f6eae9c0 R14: 00007f00f9f53040 R15: 0000000000000003
> > Kernel Offset: disabled
> > Rebooting in 86400 seconds..
> >
> >
> > ---
> > This bug is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkaller@...glegroups.com.
> >
> > syzbot will keep track of this bug report. See:
> > https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
> > syzbot can test patches for this bug, for details see:
> > https://goo.gl/tpsmEJ#testing-patches
> >
>