linux-kernel - Re: [syzbot] kernel BUG in __filemap_get

Message-ID: <YmG8zoWKu93EiWb8@casper.infradead.org>
Date:   Thu, 21 Apr 2022 21:21:34 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     syzbot <syzbot+cf4cf13056f85dec2c40@...kaller.appspotmail.com>
Cc:     akpm@...ux-foundation.org, dhowells@...hat.com, hughd@...gle.com,
        kirill.shutemov@...ux.intel.com, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, syzkaller-bugs@...glegroups.com,
        vbabka@...e.cz, william.kucharski@...cle.com
Subject: Re: [syzbot] kernel BUG in __filemap_get_folio

On Wed, Apr 20, 2022 at 08:54:32AM -0700, syzbot wrote:
> syzbot found the following issue on:

The log attached here omits some of the interesting information.
From the full console log:

> page:ffffea0000b78d00 refcount:2 mapcount:0 mapping:ffff888071347c70 index:0x234 pfn:0x2de34
> memcg:ffff888073230000
> aops:shmem_aops ino:2 dentry name:"cgroup.controllers"
> flags: 0xfff0000008003f(locked|referenced|uptodate|dirty|lru|active|swapbacked|node=0|zone=1|lastcpupid=0x7ff)
> raw: 00fff0000008003f ffffea0000b78cc8 ffffea0000b78d48 ffff888071347c70
> raw: 0000000000000234 0000000000000000 00000002ffffffff ffff888073230000
> page dumped because: VM_BUG_ON_FOLIO(!folio_contains(folio, index))
> page_owner tracks the page as allocated
> page last allocated via order 0, migratetype Movable, gfp_mask 0x13d20ca(GFP_TRANSHUGE_LIGHT|__GFP_NORETRY|__GFP_THISNODE), pid 6314, ts 110712153176, free_ts 109293647371
>  get_page_from_freelist+0xa6f/0x2f10
>  __alloc_pages+0x1b2/0x500
>  alloc_pages_vma+0x545/0x650
>  shmem_alloc_hugepage+0x18c/0x270

This call-site only allocates order-9 pages.  So clearly this was
_allocated_ as an order-9 page and then split.
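
For readers less used to allocation orders, here is a minimal sketch of the
arithmetic behind "order-9". It is plain userspace Python, not kernel code;
PAGE_SIZE and both helper names are assumptions made for illustration
(4 KiB base pages, as is usual on x86-64), not values taken from the log.

```python
# Illustration only: what an allocation "order" means in pages and bytes.
PAGE_SIZE = 4096  # assumption: 4 KiB base pages, not read from the log


def pages_in_order(order: int) -> int:
    """Number of base pages in one allocation of the given order."""
    return 1 << order


def bytes_in_order(order: int, page_size: int = PAGE_SIZE) -> int:
    """Size in bytes of one allocation of the given order."""
    return pages_in_order(order) * page_size


if __name__ == "__main__":
    for order in (0, 9):
        print(order, pages_in_order(order), bytes_in_order(order))
```

Under that assumption an order-9 allocation covers 512 base pages (2 MiB),
while the dump reports order 0, i.e. a single page, which is what a page
looks like after the huge page it came from has been split.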

> ------------[ cut here ]------------
> kernel BUG at mm/filemap.c:1917!
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> CPU: 1 PID: 6314 Comm: syz-executor.5 Not tainted 5.16.0-rc4-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:__filemap_get_folio+0x72f/0x9c0
> Code: 02 84 c0 74 09 3c 03 7f 05 e8 6d 13 1b 00 41 8b 46 58 48 39 c5 0f 82 68 fc ff ff 48 c7 c6 60 ec d3 88 4c 89 f7 e8 e1 ef 0a 00 <0f> 0b 4d 8d 6e 34 be 04 00 00 00 4c 89 ef e8 ae 16 1b 00 4c 89 e8
> RSP: 0018:ffffc90005ed78e0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000000182 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff8920c520 RDI: ffff88801191a9ca
> RBP: 0000000000000080 R08: 0000000000000019 R09: ffff8880b9f33fc7
> R10: ffffed10173e67f8 R11: 6f775f6b73617420 R12: dffffc0000000000
> R13: ffffea0000b78d00 R14: ffffea0000b78d00 R15: ffffea0000b78d00
> FS:  00007f0f22c1d700(0000) GS:ffff8880b9f00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f149ec93058 CR3: 00000000705cf000 CR4: 00000000003506e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  pagecache_get_page+0x10/0x100

I wish I knew which 'index' we were looking up.  I'll try reproducing it
locally so I can print that out too.

My suspicion is that there's a race where the folio is split during the
lookup, and the bug is really in mapping_get_entry().  The folio->index
is weird though; if this were the explanation, I'd expect the lookup to
find a page at a multiple of 512 or at least a multiple of 64.
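
The alignment argument can be checked with a small userspace sketch. It
assumes that folio_contains() reduces to an unsigned range check of the
lookup index against the folio's first index and page count; the real
helper has more cases, so treat this as a model, not the kernel's code.
The lookup indices in the usage example are hypothetical, since the actual
index was not printed in the report.

```python
# Model of the failing check; names mirror the kernel's but this is a sketch.
U64 = 1 << 64  # emulate unsigned 64-bit wraparound of pgoff_t arithmetic


def folio_contains(folio_index: int, nr_pages: int, index: int) -> bool:
    """True if 'index' falls inside [folio_index, folio_index + nr_pages)."""
    return (index - folio_index) % U64 < nr_pages


DUMPED_INDEX = 0x234  # "index:0x234" from the page dump

if __name__ == "__main__":
    # Remainders are nonzero, so the index is aligned to neither 512 nor 64.
    print(DUMPED_INDEX % 512, DUMPED_INDEX % 64)
    # Hypothetical lookups against a single-page folio at 0x234:
    print(folio_contains(DUMPED_INDEX, 1, 0x234))  # same index
    print(folio_contains(DUMPED_INDEX, 1, 0x200))  # start of the 512-page range
```

0x234 is 564, which leaves a remainder of 52 modulo both 512 and 64. A
single-page folio at that index contains only index 0x234 itself, whereas
an unsplit 512-page folio starting at 0x200 would also have contained it.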