Message-ID: <49a741c3-2564-4d6f-b4e1-0402b52a4cb9@linux.ibm.com>
Date: Sat, 22 Mar 2025 19:45:51 +0530
From: Venkat Rao Bagalkote <venkat88@...ux.ibm.com>
To: Qu Wenruo <quwenruo.btrfs@....com>,
"Ritesh Harjani (IBM)" <ritesh.list@...il.com>,
LKML <linux-kernel@...r.kernel.org>, linuxppc-dev@...ts.ozlabs.org,
Madhavan Srinivasan <maddy@...ux.ibm.com>, linux-btrfs@...r.kernel.org
Subject: Re: [linux-next-20250320][btrfs] Kernel OOPs while running btrfs/108
On 22/03/25 2:48 am, Qu Wenruo wrote:
>
>
> On 2025/3/22 02:26, Ritesh Harjani (IBM) wrote:
>>
>> +linux-btrfs
>>
>> Venkat Rao Bagalkote <venkat88@...ux.ibm.com> writes:
>>
>>> Greetings!!!
>>>
>>>
>>> I am observing a kernel oops while running the btrfs/108 test case on
>>> an IBM Power system.
>>>
>>> Repo: Linux-Next (next-20250320)
>>
>> Looks like this next tag had many btrfs related changes -
>> https://web.git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/log/fs/btrfs?h=next-20250320
>>
>>
>>>
>>> Traces:
>>>
>>> [ 418.392604] run fstests btrfs/108 at 2025-03-21 05:11:21
>>> [ 418.560137] Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
>>> [ 418.560156] BUG: Kernel NULL pointer dereference on read at 0x00000000
>>
>> NULL pointer dereference...
>>
>>> [ 418.560161] Faulting instruction address: 0xc0000000010ef8b0
>>> [ 418.560166] Oops: Kernel access of bad area, sig: 11 [#1]
>>> [ 418.560169] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=8192 NUMA pSeries
>>> [ 418.560174] Modules linked in: btrfs blake2b_generic xor raid6_pq
>>> zstd_compress loop nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib
>>> nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct
>>> nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 bonding nf_defrag_ipv4
>>> tls rfkill ip_set nf_tables nfnetlink sunrpc pseries_rng vmx_crypto fuse
>>> ext4 mbcache jbd2 sd_mod sg ibmvscsi scsi_transport_srp ibmveth
>>> [ 418.560212] CPU: 1 UID: 0 PID: 37583 Comm: rm Kdump: loaded Not tainted 6.14.0-rc7-next-20250320 #1 VOLUNTARY
>>> [ 418.560218] Hardware name: IBM,9080-HEX Power11
>>> [ 418.560223] NIP: c0000000010ef8b0 LR: c00800000bb190ac CTR: c0000000010ef888
>>> [ 418.560227] REGS: c0000000a252f5a0 TRAP: 0300 Not tainted (6.14.0-rc7-next-20250320)
>>> [ 418.560232] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 44008444 XER: 20040000
>>> [ 418.560240] CFAR: c00800000bc1df84 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 1
>>> [ 418.560240] GPR00: c00800000bb190ac c0000000a252f840 c0000000016a8100 0000000000000000
>>> [ 418.560240] GPR04: 0000000000000000 0000000000010000 0000000000000000 fffffffffffe0000
>>> [ 418.560240] GPR08: c00000010724aad8 0000000000000003 0000000000001000 c00800000bc1df70
>>> [ 418.560240] GPR12: c0000000010ef888 c000000affffdb00 0000000000000000 0000000000000000
>>> [ 418.560240] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> [ 418.560240] GPR20: c0000000777a8000 c00000006a9c9000 c00000010724a950 c0000000777a8000
>>> [ 418.560240] GPR24: fffffffffffffffe c00000010724aad8 0000000000010000 00000000000000a0
>>> [ 418.560240] GPR28: 0000000000010000 c00c00000048c3c0 0000000000000000 0000000000000000
>>> [ 418.560287] NIP [c0000000010ef8b0] _raw_spin_lock_irq+0x28/0x98
>>> [ 418.560294] LR [c00800000bb190ac] wait_subpage_spinlock+0x64/0xd0 [btrfs]
>>
>>
>> btrfs has been working on subpage size support for a while now.
>> Adding +linux-btrfs in case they are already aware of this problem.
>>
>> I am not that familiar with the btrfs code, but does this look like the
>> subpage (folio->private) somehow became NULL here?
>
> The for-next branch seems to have some conflicts. IIRC, the following two
> commits are no longer in our tree:
>
> btrfs: kill EXTENT_FOLIO_PRIVATE
> btrfs: add mapping_set_release_always to inode's mapping
>
> I believe those two may be the cause.
>
> Would you mind testing with our current for-next branch? That is where
> all of our development happens, and I run daily subpage fstests on it to
> make sure that at least that branch is safe:
>
> https://github.com/btrfs/linux/tree/for-next
>
> I would appreciate it if you could verify whether the NULL pointer
> dereference is still there on that branch.

I verified with the for-next repo, and I don't see the issue. btrfs/108
passes.

./check btrfs/108
RECREATING -- btrfs on /dev/loop0
FSTYP -- btrfs
PLATFORM -- Linux/ppc64le ltcden8-lp1 6.14.0-rc7-g88d324e69ea9 #1 SMP Sat Mar 22 07:47:48 CDT 2025
MKFS_OPTIONS -- -f -s 4096 -n 4096 /dev/loop1
MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/loop1 /mnt/scratch
btrfs/108 1s
Ran: btrfs/108
Passed all 1 tests
Repo: https://github.com/btrfs/linux/tree/for-next
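
For anyone comparing future traces against the oops quoted in this thread, here is a minimal Python sketch that pulls out the three fields most relevant to the NULL dereference: the DAR (the Data Address Register, which holds the faulting data address on Power) and the NIP/LR frames in the kernel's symbol+offset/size notation. The sample lines are copied from the trace above with the wrapped lines rejoined. The helper name summarize and the "vmlinux" label for frames without a module suffix are illustrative choices, not part of any kernel tooling.

```python
import re

# Lines copied from the oops quoted in this thread (wrapped lines rejoined).
OOPS = """\
[ 418.560240] CFAR: c00800000bc1df84 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 1
[ 418.560287] NIP [c0000000010ef8b0] _raw_spin_lock_irq+0x28/0x98
[ 418.560294] LR [c00800000bb190ac] wait_subpage_spinlock+0x64/0xd0 [btrfs]
"""

# "NIP [addr] symbol+0xoff/0xsize" with an optional trailing "[module]".
FRAME_RE = re.compile(
    r"\b(NIP|LR) \[([0-9a-f]{16})\] (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?: \[(\w+)\])?"
)
DAR_RE = re.compile(r"\bDAR: ([0-9a-f]{16})")


def summarize(text):
    """Return the faulting data address and the NIP/LR frames found in text."""
    dar = DAR_RE.search(text)
    frames = {}
    for reg, addr, sym, off, size, mod in FRAME_RE.findall(text):
        frames[reg] = {
            "address": int(addr, 16),
            "symbol": sym,
            "offset": int(off, 16),
            "size": int(size, 16),
            # "vmlinux" is only a placeholder label for built-in code.
            "module": mod or "vmlinux",
        }
    return {"dar": int(dar.group(1), 16) if dar else None, "frames": frames}


info = summarize(OOPS)
print(hex(info["dar"]), info["frames"]["NIP"]["symbol"], info["frames"]["LR"]["symbol"])
```

A DAR of zero together with NIP in _raw_spin_lock_irq and LR in wait_subpage_spinlock would be consistent with the lock being taken through a NULL pointer, as suggested earlier in the thread, though the sketch itself only extracts the fields and does not prove the cause.
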
Regards,
Venkat.
>
> Thanks,
> Qu
>
>>
>> -ritesh
>>
>>> [ 418.560339] Call Trace:
>>> [ 418.560342] [c0000000a252f870] [c00800000bb205dc] btrfs_invalidate_folio+0xa8/0x4f0 [btrfs]
>>> [ 418.560384] [c0000000a252f930] [c0000000004cbcdc] truncate_cleanup_folio+0x110/0x14c
>>> [ 418.560391] [c0000000a252f960] [c0000000004ccc7c] truncate_inode_pages_range+0x100/0x4dc
>>> [ 418.560397] [c0000000a252fbd0] [c00800000bb20ba8] btrfs_evict_inode+0x74/0x510 [btrfs]
>>> [ 418.560437] [c0000000a252fc90] [c00000000065c71c] evict+0x164/0x334
>>> [ 418.560443] [c0000000a252fd30] [c000000000647c9c] do_unlinkat+0x2f4/0x3a4
>>> [ 418.560449] [c0000000a252fde0] [c000000000647da0] sys_unlinkat+0x54/0xac
>>> [ 418.560454] [c0000000a252fe10] [c000000000033498] system_call_exception+0x138/0x330
>>> [ 418.560461] [c0000000a252fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
>>> [ 418.560468] --- interrupt: 3000 at 0x7fffb1b366bc
>>> [ 418.560471] NIP: 00007fffb1b366bc LR: 00007fffb1b366bc CTR: 0000000000000000
>>> [ 418.560475] REGS: c0000000a252fe80 TRAP: 3000 Not tainted (6.14.0-rc7-next-20250320)
>>> [ 418.560479] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 44008804 XER: 00000000
>>> [ 418.560490] IRQMASK: 0
>>> [ 418.560490] GPR00: 0000000000000124 00007ffffcb4e2b0 00007fffb1c37d00 ffffffffffffff9c
>>> [ 418.560490] GPR04: 000000013d660380 0000000000000000 0000000000000000 0000000000000003
>>> [ 418.560490] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> [ 418.560490] GPR12: 0000000000000000 00007fffb1dba5c0 00007ffffcb4e538 000000011972d0e8
>>> [ 418.560490] GPR16: 000000011972d098 000000011972d060 000000011972d020 000000011972cff0
>>> [ 418.560490] GPR20: 000000011972d298 000000011972cc10 0000000000000000 000000013d6615a0
>>> [ 418.560490] GPR24: 0000000000000002 000000011972d0b8 000000011972cf98 000000011972d1d0
>>> [ 418.560490] GPR28: 00007ffffcb4e538 000000013d6602f0 0000000000000000 0000000000100000
>>> [ 418.560532] NIP [00007fffb1b366bc] 0x7fffb1b366bc
>>> [ 418.560536] LR [00007fffb1b366bc] 0x7fffb1b366bc
>>> [ 418.560538] --- interrupt: 3000
>>> [ 418.560541] Code: 7c0803a6 4e800020 3c4c005c 38428878 7c0802a6 60000000 39200001 992d0932 a12d0008 3ce0fffe 5529083c 61290001 <7d001829> 7d063879 40c20018 7d063838
>>> [ 418.560555] ---[ end trace 0000000000000000 ]---
>>>
>>>
>>> If you happen to fix this, please add the tag below.
>>>
>>>
>>> Reported-by: Venkat Rao Bagalkote <venkat88@...ux.ibm.com>
>>>
>>>
>>> Regards,
>>>
>>> Venkat.
>>
>
>