linux-kernel - Re: Fwd: kernel bug when performing heavy IO operations

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b290c417-de1b-4af8-9f5e-133abb79580d@gmail.com>
Date:   Wed, 27 Sep 2023 13:36:52 +0800
From:   dianlujitao@...il.com
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Bagas Sanjaya <bagasdotme@...il.com>, Chris Mason <clm@...com>,
        Josef Bacik <josef@...icpanda.com>,
        David Sterba <dsterba@...e.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux btrfs <linux-btrfs@...r.kernel.org>,
        Linux Filesystem Development <linux-fsdevel@...r.kernel.org>
Subject: Re: Fwd: kernel bug when performing heavy IO operations

Hello, I got some logs with 6.5.4 kernel from the official linux package 
of Arch, no zen patches this time. Full dmesg is uploaded to 
https://fars.ee/F1yM and below is a small snippet for your convenience, 
from which PG_offline is no longer set:

[177850.039441] BUG: Bad page map in process ld.lld pte:8000000edacc4025 
pmd:147f96067
[177850.039454] page:000000007415dd6c refcount:22 mapcount:-237 
mapping:00000000b0c37ca6 index:0x1075 pfn:0xedacc4
[177850.039460] memcg:ffff9289345d4000
[177850.039463] aops:btrfs_aops [btrfs] ino:fb2b838 dentry name:"lld"
[177850.039592] flags: 
0xaffff9800002056(referenced|uptodate|lru|workingset|private|node=1|zone=2|lastcpupid=0xffff)
[177850.039597] page_type: 0xffffff12(buddy|0x6d)
[177850.039602] raw: 0affff9800002056 ffffe7623b6b3148 ffffe7623b6b3088 
ffff928202a9fc10
[177850.039605] raw: 0000000000001075 0000000000000001 00000016ffffff12 
ffff9289345d4000
[177850.039607] page dumped because: bad pte
[177850.039608] addr:0000000001275000 vm_flags:08000071 
anon_vma:0000000000000000 mapping:ffff928202a9fc10 index:1075
[177850.039612] file:lld fault:filemap_fault mmap:btrfs_file_mmap 
[btrfs] read_folio:btrfs_read_folio [btrfs]
[177850.039846] CPU: 40 PID: 2060138 Comm: ld.lld Tainted: G           
OE      6.5.4-arch2-1 #1 a30a3b4701899b64bf6025fd97642e50bf2dcad4
[177850.039851] Hardware name: JGINYUE X99-8D3/2.5G Server/X99-8D3/2.5G 
Server, BIOS 5.11 06/30/2022
[177850.039853] Call Trace:
[177850.039857]  <TASK>
[177850.039864]  dump_stack_lvl+0x47/0x60
[177850.039871]  print_bad_pte+0x1bc/0x280
[177850.039879]  ? page_remove_rmap+0x8d/0x260
[177850.039885]  unmap_page_range+0xa96/0x1150
[177850.039894]  unmap_vmas+0xf8/0x190
[177850.039900]  exit_mmap+0xe4/0x310
[177850.039909]  __mmput+0x3e/0x130
[177850.039916]  do_exit+0x31c/0xb20
[177850.039920]  ? futex_wait_queue+0x63/0x90
[177850.039927]  do_group_exit+0x31/0x80
[177850.039932]  get_signal+0x9a5/0x9e0
[177850.039941]  arch_do_signal_or_restart+0x3e/0x270
[177850.039947]  exit_to_user_mode_prepare+0x185/0x1e0
[177850.039955]  syscall_exit_to_user_mode+0x1b/0x40
[177850.039962]  do_syscall_64+0x6c/0x90
[177850.039969]  ? do_futex+0x128/0x190
[177850.039973]  ? __x64_sys_futex+0x129/0x1e0
[177850.039977]  ? switch_fpu_return+0x50/0xe0
[177850.039986]  ? syscall_exit_to_user_mode+0x2b/0x40
[177850.039991]  ? do_syscall_64+0x6c/0x90
[177850.039996]  ? exc_page_fault+0x7f/0x180
[177850.040002]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[177850.040009] RIP: 0033:0x7f1851e164ae
[177850.040056] Code: Unable to access opcode bytes at 0x7f1851e16484.
[177850.040058] RSP: 002b:00007f18227fbd30 EFLAGS: 00000246 ORIG_RAX: 
00000000000000ca
[177850.040063] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
00007f1851e164ae
[177850.040066] RDX: 0000000000000000 RSI: 0000000000000189 RDI: 
0000000005818134
[177850.040068] RBP: 0000000000000000 R08: 0000000000000000 R09: 
00000000ffffffff
[177850.040070] R10: 0000000000000000 R11: 0000000000000246 R12: 
0000000000000000
[177850.040072] R13: 00000000058180e0 R14: 0000000000000000 R15: 
0000000005818134
[177850.040078]  </TASK>
[177850.040112] Disabling lock debugging due to kernel taint

在 2023/8/27 19:54, Matthew Wilcox 写道:
> On Sun, Aug 27, 2023 at 12:34:54PM +0800, dianlujitao wrote:
>> 在 2023/8/27 11:45, Matthew Wilcox 写道:
>>> On Sun, Aug 27, 2023 at 10:20:51AM +0700, Bagas Sanjaya wrote:
>>>>> When the IO load is heavy (compiling AOSP in my case), there's a chance to crash the kernel, the only way to recover is to perform a hard reset. Logs look like follows:
>>>>>
>>>>> 8月 25 13:52:23 arch-pc kernel: BUG: Bad page map in process tmux: client  pte:8000000462500025 pmd:b99c98067
>>>>> 8月 25 13:52:23 arch-pc kernel: page:00000000460fa108 refcount:4 mapcount:-256 mapping:00000000612a1864 index:0x16 pfn:0x462500
>>>>> 8月 25 13:52:23 arch-pc kernel: memcg:ffff8a1056ed0000
>>>>> 8月 25 13:52:23 arch-pc kernel: aops:btrfs_aops [btrfs] ino:9c4635 dentry name:"locale-archive"
>>>>> 8月 25 13:52:23 arch-pc kernel: flags: 0x2ffff5800002056(referenced|uptodate|lru|workingset|private|node=0|zone=2|lastcpupid=0xffff)
>>>>> 8月 25 13:52:23 arch-pc kernel: page_type: 0xfffffeff(offline)
>>> This is interesting.  PG_offline is set.
>>>
>>> $ git grep SetPageOffline
>>> arch/powerpc/platforms/powernv/memtrace.c:              __SetPageOffline(pfn_to_page(pfn));
>>> drivers/hv/hv_balloon.c:                        __SetPageOffline(pg);
>>> drivers/hv/hv_balloon.c:                        __SetPageOffline(pg + j);
>>> drivers/misc/vmw_balloon.c:             __SetPageOffline(page + i);
>>> drivers/virtio/virtio_mem.c:            __SetPageOffline(page);
>>> drivers/xen/balloon.c:  __SetPageOffline(page);
>>> include/linux/balloon_compaction.h:     __SetPageOffline(page);
>>> include/linux/balloon_compaction.h:     __SetPageOffline(page);
>>>
>>> But there's no indication that this kernel is running under a
>>> hypervisor:
>>>
>>>>> 8月 25 13:52:23 arch-pc kernel: Hardware name: JGINYUE X99-8D3/2.5G Server/X99-8D3/2.5G Server, BIOS 5.11 06/30/2022
>> Yes, I'm running on bare metal hardware.
>>> So I'd agree with Artem, this looks like bad RAM.
>>>
>> I ran memtest86+ 6.20 for a cycle and it passed. However, could an OOM
>> trigger the bug? e.g., kernel bug fired before the OOM killer has a
>> chance to start? Just a guess because the last log entry in journalctl
>> before "BUG" is an hour earlier.
> The problem is that OOM doesn't SetPageOffline.  The only things that
> do are hypervisor guest drivers.  So we've got a random bit being
> cleared, and either that's a stray write which happens to land in
> the struct page in question, or it's bad hardware.  Since it's a
> single bit that's being cleared, bad hardware is the most likely
> explanation, but it's not impossible for there to be a bug that's
> doing this.  The problem is that it could be almost anything ...