[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240716062030.donbv4a6oytsco44@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com>
Date: Tue, 16 Jul 2024 14:20:30 +0800
From: Zorro Lang <zlang@...hat.com>
To: Theodore Ts'o <tytso@....edu>
Cc: linux-ext4@...r.kernel.org, fstests@...r.kernel.org,
"Darrick J. Wong" <djwong@...nel.org>,
Daniel Gomez <da.gomez@...sung.com>
Subject: Re: [Bug report]: fstests g/388 crash on ext4, BUG: kernel NULL
pointer dereference, address: 0000000000000000
On Mon, Jul 15, 2024 at 12:28:03AM -0400, Theodore Ts'o wrote:
> On Sun, Jul 14, 2024 at 11:46:24AM +0800, Zorro Lang wrote:
> >
> > A weird kernel panic on ext4 happened when I tried to test a
> > fstests patchset:
> > https://lore.kernel.org/fstests/20240712093341.ftesijixy2yrjlxx@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com/T/#med4b8d2fe14ef627519d84474b4cd1a25d386f75
>
> I'm confused; this patch set:
>
> Daniel Gomez (5):
> common/config: fix RECREATE_TEST_DEV initialization
> common/rc: add recreation support for tmpfs
> common/config: enable section parsing when recreation
> common/rc: read config section mount options for scratch devs
> common/rc: print test mount options
>
> seems to be mostly about how xfstest config section handling
> especially for tmpfs. Is this realy the right patch set? If so, I'm
> guessing that the reproducer would be very specific to the xfstests
> config.
>
> My {kvm,gce}-xfstest setup doesn't use the config sections at
> all, but instead uses shell script fragments, since it predates config
> sections by three years --- and I need something that works well with
> sharding separate configs to run on separate cloud VM's.
>
> So I'm not sure I'm going to be able to reprduce this easily using my
> test setup. Can you translate the stack trace to source file names /
> line numbers? Maybe that will give me a hint what's going on:
>
> > [35346.372867] Call Trace:
> > [35346.375319] <TASK>
> > [35346.377426] ? __die+0x20/0x70
> > [35346.380493] ? page_fault_oops+0x116/0x230
> > [35346.384602] ? __pfx_page_fault_oops+0x10/0x10
> > [35346.389048] ? _raw_spin_unlock+0x29/0x50
> > [35346.393072] ? rcu_is_watching+0x11/0xb0
> > [35346.397006] ? exc_page_fault+0x59/0xe0
> > [35346.400854] ? asm_exc_page_fault+0x22/0x30
> > [35346.405049] ? folio_mark_dirty+0x2a/0xf0
> > [35346.409072] __ext4_block_zero_page_range+0x50c/0x7b0 [ext4]
> > [35346.414809] ext4_truncate+0xcd3/0x1210 [ext4]
>
> Getting line numbers for these two functions would be especially
> helpful.
Sure, Ted. I reproduced this bug and got below things[1] on mainline linux
which HEAD=528dd46d0fc35c0176257a13a27d41e44fcc6cb3
And if you need, I pushed a temporary branch "whatamess4extN" to fstests
repo, which contains the patches trigger this bug.
Thanks,
Zorro
[1]
# ./scripts/decode_stacktrace.sh vmlinux <~/calltrace.log
[ 912.644200] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=131891
[ 912.645099] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=525225
[ 912.894856] EXT4-fs (vda2): unmounting filesystem b9690547-c193-4a82-b0df-4682bd621d3f.
[ 912.947581] EXT4-fs (vda2): 1 truncate cleaned up
[ 912.947892] EXT4-fs (vda2): recovery complete
[ 912.950912] EXT4-fs (vda2): mounted filesystem b9690547-c193-4a82-b0df-4682bd621d3f r/w with ordered data mode. Quota mode: none.
[ 912.994565] EXT4-fs warning (device vda2): ext4_convert_unwritten_extents_endio:3720: Inode (525267) finished: extent logical block 161, len 120; IO logical block 222, len 19
[ 912.997878] EXT4-fs warning (device vda2): ext4_convert_unwritten_extents_endio:3720: Inode (525267) finished: extent logical block 241, len 40; IO logical block 241, len 9
[ 914.017223] restraintd[1427]: *** Current Time: Sat Jul 13 15:03:01 2024 Localwatchdog at: Mon Jul 15 14:51:00 2024
[ 915.003343] EXT4-fs (vda2): shut down requested (2)
[ 915.003671] Aborting journal on device vda2-8.
[ 915.663314] EXT4-fs (vda2): unmounting filesystem b9690547-c193-4a82-b0df-4682bd621d3f.
[ 915.725813] EXT4-fs (vda2): INFO: recovery required on readonly filesystem
[ 915.726249] EXT4-fs (vda2): write access will be enabled during recovery
[ 916.035952] EXT4-fs (vda2): recovery complete
[ 916.038225] EXT4-fs (vda2): mounted filesystem b9690547-c193-4a82-b0df-4682bd621d3f ro with ordered data mode. Quota mode: none.
[ 916.059891] EXT4-fs (vda2): unmounting filesystem b9690547-c193-4a82-b0df-4682bd621d3f.
[ 916.159613] EXT4-fs (vda2): mounted filesystem b9690547-c193-4a82-b0df-4682bd621d3f r/w with ordered data mode. Quota mode: none.
[ 916.199256] EXT4-fs (vda2): shut down requested (2)
[ 916.199659] Aborting journal on device vda2-8.
[ 916.200912] EXT4-fs warning (device vda2): ext4_evict_inode:253: couldn't mark inode dirty (err -5)
[ 916.203621] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=263200
[ 916.205150] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=525058
[ 916.205868] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=524568
[ 916.206610] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=656330
[ 916.207979] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=131080
[ 916.208932] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=132089
[ 916.209218] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=656330
[ 916.210157] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=262970
[ 916.211213] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=263204
[ 916.211777] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=132089
[ 916.212301] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=132089
[ 916.214649] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=131374
[ 916.214786] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=524568
[ 916.216375] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=132004
[ 916.216881] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=132004
[ 916.217401] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=132004
[ 916.219891] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=131876
[ 916.221661] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=656082
[ 916.221743] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=131875
[ 916.223237] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=131875
[ 916.225723] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=131875
[ 916.230093] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=131860
[ 916.232398] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=525049
[ 916.233901] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=131871
[ 916.235671] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=131886
[ 916.238753] SELinux: inode_doinit_use_xattr: getxattr returned 5 for dev=vda2 ino=131891
[ 916.489675] EXT4-fs (vda2): unmounting filesystem b9690547-c193-4a82-b0df-4682bd621d3f.
[ 916.540454] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 916.540885] #PF: supervisor instruction fetch in kernel mode
[ 916.541226] #PF: error_code(0x0010) - not-present page
[ 916.541533] PGD 0 P4D 0
[ 916.541694] Oops: Oops: 0010 [#1] PREEMPT SMP KASAN PTI
[ 916.542451] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 916.542791] RIP: 0010:0x0
[ 916.542958] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
Code starting with the faulting instruction
===========================================
[ 916.543340] RSP: 0018:ffffc90008f0f648 EFLAGS: 00010246
[ 916.543650] RAX: 0000000000000000 RBX: ffff88818c487820 RCX: ffffffff951a6fea
[ 916.544069] RDX: 1ffffffff2ec8c6f RSI: ffffea0005130dc0 RDI: ffff88818c487a60
[ 916.544486] RBP: ffffea0005130dc0 R08: 0000000000000000 R09: fffff94000a261b8
[ 916.544903] R10: ffffea0005130dc7 R11: 0000000000000000 R12: 0000000000000216
[ 916.545326] R13: ffff88818c6822d0 R14: 0000000000000000 R15: 0000000000000000
[ 916.545743] FS: 00007ffa15285800(0000) GS:ffff8881f6600000(0000) knlGS:0000000000000000
[ 916.546214] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 916.546556] CR2: ffffffffffffffd6 CR3: 000000013914c004 CR4: 00000000003706f0
[ 916.546974] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 916.547393] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 916.547810] Call Trace:
[ 916.547964] <TASK>
[ 916.548102] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[ 916.548298] ? page_fault_oops (arch/x86/mm/fault.c:715)
[ 916.548547] ? __pfx_page_fault_oops (arch/x86/mm/fault.c:643)
[ 916.548815] ? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:103 ./include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:186)
[ 916.549069] ? rcu_is_watching (./include/linux/context_tracking.h:122 kernel/rcu/tree.c:724)
[ 916.549310] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:67 ./arch/x86/include/asm/irqflags.h:127 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
[ 916.549543] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
[ 916.549797] ? folio_mark_dirty (./arch/x86/include/asm/bitops.h:206 ./arch/x86/include/asm/bitops.h:238 ./include/asm-generic/bitops/instrumented-non-atomic.h:142 ./include/linux/page-flggs.h:562 mm/page-writeback.c:2880)
[ 916.550048] __ext4_block_zero_page_range (fs/ext4/inode.c:986 fs/ext4/inode.c:3679) ext4
[ 916.550453] ext4_truncate (fs/ext4/inode.c:3744 fs/ext4/inode.c:4119) ext4
[ 916.550779] ? ext4_process_orphan (fs/ext4/orphan.c:338 (discriminator 3)) ext4
[ 916.551142] ? __pfx_ext4_truncate (fs/ext4/inode.c:4070) ext4
[ 916.551490] ? __pfx_down_write (kernel/locking/rwsem.c:1577)
[ 916.551732] ? ext4_inode_is_fast_symlink (./arch/x86/include/asm/bitops.h:206 ./arch/x86/include/asm/bitops.h:238 ./include/asm-generic/bitops/instrumented-non-atomic.h:142 fs/ext4/ext4.h:1939 fs/ext4/ext4.h:3603 fs/ext4/inode.c:152 fs/ext4/inode.c:146) ext4
[ 916.552128] ext4_process_orphan (fs/ext4/orphan.c:339 (discriminator 3)) ext4
[ 916.552483] ext4_orphan_cleanup (fs/ext4/orphan.c:456) ext4
[ 916.552839] ? __pfx_ext4_orphan_cleanup (fs/ext4/orphan.c:381) ext4
[ 916.553220] ? is_module_address (./arch/x86/include/asm/preempt.h:103 kernel/module/main.c:3283)
[ 916.553471] __ext4_fill_super (fs/ext4/ext4.h:1763 fs/ext4/super.c:5554) ext4
[ 916.553828] ? __pfx___ext4_fill_super (fs/ext4/super.c:5181) ext4
[ 916.554203] ? __kmalloc_large_node (mm/slub.c:4081)
[ 916.554480] ? rcu_is_watching (./include/linux/context_tracking.h:122 kernel/rcu/tree.c:724)
[ 916.554718] ext4_fill_super (fs/ext4/super.c:5677) ext4
[ 916.555058] get_tree_bdev (fs/super.c:1624)
[ 916.555290] ? __pfx_ext4_fill_super (fs/ext4/super.c:5657) ext4
[ 916.555651] ? __pfx_get_tree_bdev (fs/super.c:1595)
[ 916.555910] ? security_sb_eat_lsm_opts (security/security.c:1361 (discriminator 13))
[ 916.556203] vfs_get_tree (fs/super.c:1789)
[ 916.556423] do_new_mount (fs/namespace.c:3352)
[ 916.556647] ? __pfx_do_new_mount (fs/namespace.c:3307)
[ 916.556897] ? security_capable (security/security.c:1036 (discriminator 13))
[ 916.557142] path_mount (fs/namespace.c:3679)
[ 916.557362] ? __pfx_path_mount (fs/namespace.c:3606)
[ 916.557602] ? user_path_at_empty (fs/namei.c:2933)
[ 916.557854] __x64_sys_mount (fs/namespace.c:3693 fs/namespace.c:3898 fs/namespace.c:3875 fs/namespace.c:3875)
[ 916.558093] ? __pfx___x64_sys_mount (fs/namespace.c:3875)
[ 916.558364] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
[ 916.558587] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4421)
[ 916.558854] ? do_syscall_64 (arch/x86/entry/common.c:102)
[ 916.559090] ? from_kuid_munged (kernel/user_namespace.c:460)
[ 916.559338] ? rcu_is_watching (./include/linux/context_tracking.h:122 kernel/rcu/tree.c:724)
[ 916.559574] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:4360)
[ 916.559879] ? do_syscall_64 (arch/x86/entry/common.c:102)
[ 916.560113] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4421)
[ 916.560377] ? do_syscall_64 (arch/x86/entry/common.c:102)
[ 916.560610] ? ktime_get_coarse_real_ts64 (./include/linux/seqlock.h:74 kernel/time/timekeeping.c:2264)
[ 916.560910] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4421)
[ 916.561252] ? rcu_is_watching (./include/linux/context_tracking.h:122 kernel/rcu/tree.c:724)
[ 916.561494] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:4360)
[ 916.561802] ? do_syscall_64 (arch/x86/entry/common.c:102)
[ 916.562040] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4421)
[ 916.562307] ? do_syscall_64 (arch/x86/entry/common.c:102)
[ 916.562541] ? clear_bhb_loop (arch/x86/entry/entry_64.S:1539)
[ 916.562774] ? clear_bhb_loop (arch/x86/entry/entry_64.S:1539)
[ 916.563011] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[ 916.563314] RIP: 0033:0x7ffa1510f03e
[ 916.563535] Code: 48 8b 0d e5 ad 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b2 ad 0e 00 f7 d8 64 89 01 48
All code
========
0: 48 8b 0d e5 ad 0e 00 mov 0xeade5(%rip),%rcx # 0xeadec
7: f7 d8 neg %eax
9: 64 89 01 mov %eax,%fs:(%rcx)
c: 48 83 c8 ff or $0xffffffffffffffff,%rax
10: c3 retq
11: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
18: 00 00 00
1b: 90 nop
1c: f3 0f 1e fa endbr64
20: 49 89 ca mov %rcx,%r10
23: b8 a5 00 00 00 mov $0xa5,%eax
28: 0f 05 syscall
2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
30: 73 01 jae 0x33
32: c3 retq
33: 48 8b 0d b2 ad 0e 00 mov 0xeadb2(%rip),%rcx # 0xeadec
3a: f7 d8 neg %eax
3c: 64 89 01 mov %eax,%fs:(%rcx)
3f: 48 rex.W
Code starting with the faulting instruction
===========================================
0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
6: 73 01 jae 0x9
8: c3 retq
9: 48 8b 0d b2 ad 0e 00 mov 0xeadb2(%rip),%rcx # 0xeadc2
10: f7 d8 neg %eax
12: 64 89 01 mov %eax,%fs:(%rcx)
15: 48 rex.W
[ 916.564607] RSP: 002b:00007ffc1e936e28 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[ 916.565054] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffa1510f03e
[ 916.565473] RDX: 00005555d6678630 RSI: 00005555d66786b0 RDI: 00005555d6678690
[ 916.565893] RBP: 00005555d6678400 R08: 00005555d6678650 R09: 00007ffc1e935b50
[ 916.566314] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 916.566733] R13: 00005555d6678630 R14: 00005555d6678690 R15: 00005555d6678400
[ 916.567160] </TASK>
[ 916.567301] Modules linked in: tls ext4 mbcache jbd2 rfkill snd_hda_codec_generic snd_hda_intel intel_rapl_msr snd_intel_dspcfg intel_rapl_common snd_intel_sdw_acpi snd_hda_codec snd_hda_core sunrpc intel_uncore_frequency_common snd_hwdep snd_seq intel_pmc_core snd_seq_device intel_vsec pmt_telemetry pmt_class snd_pcm qxl snd_timer pcspkr drm_ttm_helper ttm virtio_balloon snd soundcore drm_kms_helper i2c_piix4 joydev drm fuse xfs libcrc32c ata_generic virtio_net crct10dif_pclmul crc32_pclmul net_failover crc32c_intel failover ghash_clmulni_intel dimlib ata_piix virtio_console virtio_blk libata serio_raw
[ 916.570389] CR2: 0000000000000000
[ 916.570597] ---[ end trace 0000000000000000 ]---
[ 916.570876] RIP: 0010:0x0
[ 916.571045] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
Code starting with the faulting instruction
===========================================
[ 916.571428] RSP: 0018:ffffc90008f0f648 EFLAGS: 00010246
[ 916.571740] RAX: 0000000000000000 RBX: ffff88818c487820 RCX: ffffffff951a6fea
[ 916.572163] RDX: 1ffffffff2ec8c6f RSI: ffffea0005130dc0 RDI: ffff88818c487a60
[ 916.572583] RBP: ffffea0005130dc0 R08: 0000000000000000 R09: fffff94000a261b8
[ 916.573005] R10: ffffea0005130dc7 R11: 0000000000000000 R12: 0000000000000216
[ 916.573425] R13: ffff88818c6822d0 R14: 0000000000000000 R15: 0000000000000000
[ 916.573848] FS: 00007ffa15285800(0000) GS:ffff8881f6600000(0000) knlGS:0000000000000000
[ 916.574321] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 916.574664] CR2: ffffffffffffffd6 CR3: 000000013914c004 CR4: 00000000003706f0
[ 916.575087] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 916.575508] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 916.575928] note: mount[99339] exited with irqs disabled
[ 916.638225] EXT4-fs (vda3): unmounting filesystem 92fe26f7-76ab-4251-bac6-305c3e2ef932.
[ 916.816486] EXT4-fs (vda3): mounted filesystem 92fe26f7-76ab-4251-bac6-305c3e2ef932 r/w with ordered data mode. Quota mode: none.
>
> Thanks,
>
> - Ted
>
Powered by blists - more mailing lists