Message-ID: <989d5e03-d18e-4fbf-8b55-a847a928c8fd@redhat.com>
Date: Thu, 21 Mar 2024 10:58:01 +0100
From: David Hildenbrand <david@...hat.com>
To: Muchun Song <muchun.song@...ux.dev>,
 syzbot <syzbot+3b9148f91b7869120e81@...kaller.appspotmail.com>,
 Oscar Salvador <osalvador@...e.de>, Matthew Wilcox <willy@...radead.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
 LKML <linux-kernel@...r.kernel.org>, Linux-MM <linux-mm@...ck.org>,
 syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [mm?] kernel BUG in const_folio_flags

On 21.03.24 10:49, Muchun Song wrote:
> 
> 
>> On Mar 21, 2024, at 12:04, syzbot <syzbot+3b9148f91b7869120e81@...kaller.appspotmail.com> wrote:
>>
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:    78c3925c048c Merge tag 'soc-late-6.9' of git://git.kernel...
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=1267d879180000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=f3c2635ded15fbc9
>> dashboard link: https://syzkaller.appspot.com/bug?extid=3b9148f91b7869120e81
>> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>> userspace arch: i386
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> Downloadable assets:
>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-78c3925c.raw.xz
>> vmlinux: https://storage.googleapis.com/syzbot-assets/cf2bceeccde3/vmlinux-78c3925c.xz
>> kernel image: https://storage.googleapis.com/syzbot-assets/fc938dfaea6d/bzImage-78c3925c.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+3b9148f91b7869120e81@...kaller.appspotmail.com
>>
>> veth_newlink+0x627/0xa10 drivers/net/veth.c:1895
>> rtnl_newlink_create net/core/rtnetlink.c:3494 [inline]
>> __rtnl_newlink+0x119c/0x1960 net/core/rtnetlink.c:3714
>> rtnl_newlink+0x67/0xa0 net/core/rtnetlink.c:3727
>> rtnetlink_rcv_msg+0x3c7/0xe60 net/core/rtnetlink.c:6595
>> ------------[ cut here ]------------
>> kernel BUG at include/linux/page-flags.h:315!
> 
> There is some more page dump information from the console:
> 
> [ 61.367144][ T42] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff888028132880 pfn:0x28130
> [ 61.371430][ T42] flags: 0xfff80000000000(node=0|zone=1|lastcpupid=0xfff)
> [ 61.374455][ T42] page_type: 0xffffffff()
> [ 61.376096][ T42] raw: 00fff80000000000 ffff888015ecd540 dead000000000100 0000000000000000
> [ 61.379994][ T42] raw: ffff888028132880 0000000000190000 00000000ffffffff 0000000000000000
> 
> Alright, the page is freed (with a refcount of 0).
> 
>> invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
>> CPU: 1 PID: 42 Comm: kcompactd0 Not tainted 6.8.0-syzkaller-11725-g78c3925c048c #0
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
>> RIP: 0010:const_folio_flags+0x1bd/0x1f0 include/linux/page-flags.h:315
> 
> The RIP is in const_folio_flags() (called from folio_test_hugetlb()):
> 
> 	VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page);
> 
> It is reasonable to WARN because the page is freed (PG_head is not set
> in this case).
> 
> The comment on folio_test_hugetlb() says "Caller should have a
> reference on the folio", so the caller of PageHuge() should grab
> a refcount before calling folio_test_hugetlb() since commit
> 9c5ccf2db04b. But it does not mean that the @page must be a HugeTLB page
> even if PageHuge(@page) returns true when the user does not hold
> an extra refcount on the @page. It seems the WARN could be acceptable, so
> should we remove this WARN? I am not sure. Cc'ing more experts.

Isn't this the problem Willy is fixing with the upcoming
folio_test_hugetlb() changes?

We cannot always grab a folio reference on hugetlb folios: free hugetlb 
folios have a refcount of 0.
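
To spell that out, here is a rough userspace sketch, not kernel code. The
model_* names and struct model_page are made up for illustration:
model_try_get() only mimics the "take a reference unless the refcount is
zero" behaviour of helpers such as get_page_unless_zero(), and
model_check_fires() mirrors the condition in the VM_BUG_ON_PGFLAGS() line
quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the fields discussed here. */
struct model_page {
	unsigned long flags;
	int refcount;
};

#define MODEL_PG_HEAD (1UL << 0)	/* stand-in for the PG_head bit */

/*
 * Assumed semantics: take a reference only if the refcount is non-zero.
 * A free page (refcount 0) can therefore never be pinned this way.
 */
static bool model_try_get(struct model_page *p)
{
	if (p->refcount == 0)
		return false;
	p->refcount++;
	return true;
}

/*
 * Mirrors the quoted condition: asking for the flags of page n > 0
 * fires when the first page does not have the head bit set.
 */
static bool model_check_fires(const struct model_page *page, unsigned int n)
{
	return n > 0 && !(page->flags & MODEL_PG_HEAD);
}
```

For a page with no head bit and a refcount of 0, which is the combination
in the page dump above, model_try_get() fails and model_check_fires(page, 1)
returns true, so under these assumptions a caller cannot satisfy the "hold a
reference first" rule for such a page.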

-- 
Cheers,

David / dhildenb

