linux-ext4 - Re: kernel BUG at fs/ext4/mballoc.c:1911!

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANQeFDDFSPfmYcVuNme8BhQt+78Fo0GyUmG8udyGdPfzmxAjnQ@mail.gmail.com>
Date:   Mon, 9 Apr 2018 10:25:46 -0700
From:   Liu Bo <obuil.liubo@...il.com>
To:     Liu Bo <bo.liu@...ux.alibaba.com>, tytso@....edu,
        Jan Kara <jack@...e.cz>
Cc:     linux-ext4@...r.kernel.org
Subject: Re: kernel BUG at fs/ext4/mballoc.c:1911!

Hi,

Ping?

>From the code's perspective, it seems like that this situation could
only happen when some memory corruption happened, but I'm not 100%
sure.

Ted or Jan, could you please take a look at this?

thanks,
liubo


On Thu, Mar 29, 2018 at 3:15 PM, Liu Bo <bo.liu@...ux.alibaba.com> wrote:
> Hi,
>
> I got the following crash from ext4, although I got a partially dumped
> vmcore, I'm not able to find out the root cause, any ideas are
> appreciated.
>
>
> The kernel is 4.9.0.
>
> [943102.751128] EXT4-fs error (device nvme0n1p1): ext4_wait_block_bitmap:503: comm fsabc: Cannot read block bitmap - block_group = 8383, block_bitmap = 274202639
> [943102.751131] EXT4-fs (nvme0n1p1): previous I/O error to superblock detected
> [943102.751134] Buffer I/O error on dev nvme0n1p1, logical block 0, lost sync page write
> [943102.751298] ------------[ cut here ]------------
> [943102.751299] kernel BUG at fs/ext4/mballoc.c:1911!
> [943102.751300] invalid opcode: 0000 [#1] SMP
> ...
> [943102.751316] task: ffff887be5fabe00 task.stack: ffffc90037994000
> [943102.751339] RIP: 0010:[<ffffffffa05336dc>]  [<ffffffffa05336dc>] ext4_mb_simple_scan_group+0x14c/0x160 [ext4]
> [943102.751340] RSP: 0018:ffffc90037997820  EFLAGS: 00010246
> [943102.751340] RAX: 0000000000000008 RBX: 000000000000000c RCX: 0000000000000008
> [943102.751341] RDX: 0000000000000040 RSI: 0000000000000038 RDI: ffff88634047aff8
> [943102.751342] RBP: ffffc90037997860 R08: ffff000000000000 R09: 000015f80607a517
> [943102.751342] R10: ffff887e3f89c000 R11: ffff886ff7a54928 R12: ffff887d002cde80
> [943102.751343] R13: ffff887e3cabc000 R14: ffffc90037997898 R15: 0000000000000030
> [943102.751344] FS:  00007f1da6eff700(0000) GS:ffff887e7e680000(0000) knlGS:0000000000000000
> [943102.751345] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [943102.751345] CR2: 00007f1d78008708 CR3: 0000007be3012000 CR4: 00000000007606f0
> [943102.751346] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [943102.751347] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [943102.751347] PKRU: 55555554
> [943102.751347] Stack:
> [943102.751350]  000000080000417f 0000000800000001 6a0d0bf01fb48b22 ffff887d002cde80
> [943102.751351]  00000000000020bf 0000000000000000 0000000000000000 ffff887e3cabc000
> [943102.751352]  ffffc90037997900 ffffffffa05340a6 00000000379978a0 ffff887e3cabd000
> [943102.751353] Call Trace:
> [943102.751363]  [<ffffffffa05340a6>] ext4_mb_regular_allocator+0x356/0x460 [ext4]
> [943102.751371]  [<ffffffffa0535b9c>] ext4_mb_new_blocks+0x5ec/0xaf0 [ext4]
> [943102.751379]  [<ffffffffa052434d>] ? __read_extent_tree_block+0x5d/0x1f0 [ext4]
> [943102.751386]  [<ffffffffa0525633>] ? ext4_find_extent+0x143/0x2d0 [ext4]
> [943102.751394]  [<ffffffffa052a7de>] ext4_ext_map_blocks+0xb5e/0xf30 [ext4]
> [943102.751397]  [<ffffffff811bb75c>] ? node_dirty_ok+0x12c/0x170
> [943102.751403]  [<ffffffffa04f7802>] ext4_map_blocks+0x172/0x600 [ext4]
> [943102.751406]  [<ffffffff8127a8c1>] ? alloc_buffer_head+0x21/0x60
> [943102.751407]  [<ffffffff81233601>] ? mem_cgroup_commit_charge+0x91/0x530
> [943102.751413]  [<ffffffffa04f7d22>] _ext4_get_block+0x92/0x100 [ext4]
> [943102.751419]  [<ffffffffa04f7da6>] ext4_get_block+0x16/0x20 [ext4]
> [943102.751420]  [<ffffffff8127d357>] __block_write_begin_int+0x197/0x5e0
> [943102.751425]  [<ffffffffa04f7d90>] ? _ext4_get_block+0x100/0x100 [ext4]
> [943102.751432]  [<ffffffffa04fcb56>] ? ext4_write_begin+0x126/0x5b0 [ext4]
> [943102.751433]  [<ffffffff8127d7b1>] __block_write_begin+0x11/0x20
> [943102.751439]  [<ffffffffa04fcbdc>] ext4_write_begin+0x1ac/0x5b0 [ext4]
> [943102.751446]  [<ffffffffa052cfdd>] ? __ext4_journal_stop+0x3d/0xa0 [ext4]
> [943102.751449]  [<ffffffff811ab578>] generic_perform_write+0xc8/0x1c0
> [943102.751451]  [<ffffffff8125f37e>] ? file_update_time+0x5e/0x110
> [943102.751452]  [<ffffffff811adbb5>] __generic_file_write_iter+0x185/0x1d0
> [943102.751458]  [<ffffffffa04f1a6b>] ext4_file_write_iter+0x8b/0x380 [ext4]
> [943102.751460]  [<ffffffff81247429>] ? vfs_getattr_nosec+0x29/0x40
> [943102.751462]  [<ffffffff81247c6f>] ? cp_new_stat+0x14f/0x180
> [943102.751463]  [<ffffffff81241115>] __vfs_write+0xe5/0x160
> [943102.751464]  [<ffffffff812423b5>] vfs_write+0xb5/0x1a0
> [943102.751465]  [<ffffffff81243875>] SyS_write+0x55/0xc0
> [943102.751468]  [<ffffffff8171a6da>] entry_SYSCALL_64_fastpath+0x1a/0xc5
> [943102.751476] Code: 39 44 24 3c 75 27 49 8b 85 60 <0f> 0b 0f 0b e8 3b 57 b5 e0 90 66
> [943102.751484] RIP  [<ffffffffa05336dc>] ext4_mb_simple_scan_group+0x14c/0x160 [ext4]
> [943102.751484]  RSP <ffffc90037997820>
>
>
> Here is my diagnosis so far,
>
> The code of interest is
> ---------------------------
> static noinline_for_stack
> void ext4_mb_simple_scan_group(struct ext4_allocation_context *ac,
>                                         struct ext4_buddy *e4b)
> {
>         struct super_block *sb = ac->ac_sb;
>         struct ext4_group_info *grp = e4b->bd_info;
>         void *buddy;
>         int i;
>         int k;
>         int max;
>
>         BUG_ON(ac->ac_2order <= 0);
>         for (i = ac->ac_2order; i <= sb->s_blocksize_bits + 1; i++) {
>                 if (grp->bb_counters[i] == 0)
>                         continue;
>
>                 buddy = mb_find_buddy(e4b, i, &max);
>                 BUG_ON(buddy == NULL);
>
>                 k = mb_find_next_zero_bit(buddy, max, 0);
>                 BUG_ON(k >= max);
>
>                 ac->ac_found++;
>
>                 ac->ac_b_ex.fe_len = 1 << i;
>                 ac->ac_b_ex.fe_start = k << i;
>                 ac->ac_b_ex.fe_group = e4b->bd_group;
> -------------------------------------
> (a) ac->ac_2order = 11 as
> crash> grep ac_2order ac.txt
>   ac_2order = 11 '\v',
>
> (b) ac_b_ex.fe_start = 0, so order = 11 has been skipped in continue;
> crash> grep ac_b_ex -A 5 ac.txt
>   ac_b_ex = {
>     fe_logical = 2048,
>     fe_start = 0,
>     fe_group = 0,
>     fe_len = 0
>   },
>
> (c) grp->bb_counters[11] = 0, so we head to i = 12
>
> (d)
> Inside the buddy bitmap in memory, 'order=12' has 1 free block, i.e. bb_counters[12] = 1.
> -----------
> crash> grep bb_counter grp.txt
>   bb_counters = 0xffff887e101a9648
> crash> rd 0xffff887e101a9648 -32 16
> ffff887e101a9648:  00000013 fffffe21 ffffff33 ffffffbb   ....!...3.......
> ffff887e101a9658:  fffffffa fffffff8 00000007 00000002   ................
> ffff887e101a9668:  0000000a 00000005 00000007 00000000   ................
> ffff887e101a9678:  00000001 00000001 00000000 00000000   ................
> -----------
>
> But...
> k = mb_find_next_zero_bit(buddy, max, 0);
> BUG_ON(k >= max)   --> kernel BUG at fs/ext4/mballoc.c:1911!
>
> k = max = 8,
> the buddy bitmap shows every bit is 1, i.e. no free block is available.
>
> (e) there're errors about reading this bitmap(group 8383) shown in the log,
> crash> grep group e4b.txt
>   bd_group = 8383
>
>  however when it comes to BUG_ON(k >= max), reading this bitmap has
>  been successful, and it is the inconsistence between ->bb_counters
>  and the buddy bitmap that ends up with the crash, but if the buddy
>  bitmap was regenerated, bb_counters should match with the buddy
>  bitmap.
>
> Anyway, can anyone take a look at it, please?
>
> thanks,
> -liubo