Message-ID: <202301212330.48ec3159-yujie.liu@intel.com>
Date: Sun, 22 Jan 2023 00:07:55 +0800
From: kernel test robot <yujie.liu@...el.com>
To: zhanchengbin <zhanchengbin1@...wei.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
Zhang Yi <yi.zhang@...wei.com>, <linux-ext4@...r.kernel.org>,
<tytso@....edu>, <jack@...e.com>, <linfeilong@...wei.com>,
<liuzhiqiang26@...wei.com>, zhanchengbin <zhanchengbin1@...wei.com>
Subject: Re: [PATCH 2/2] ext4: call ext4_handle_error when read extent failed
in ext4_ext_insert_extent
Greetings,
FYI, we noticed kernel_BUG_at_fs/ext4/extents_status.c due to the following commit (built with gcc-11):
commit: ce10c493af382439876867dcaee89c7efddfab46 ("[PATCH 2/2] ext4: call ext4_handle_error when read extent failed in ext4_ext_insert_extent")
url: https://github.com/intel-lab-lkp/linux/commits/zhanchengbin/ext4-fix-inode-tree-inconsistency-caused-by-ENOMEM-in-ext4_split_extent_at/20230110-211157
base: https://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git dev
patch link: https://lore.kernel.org/all/20230110133407.994711-3-zhanchengbin1@huawei.com/
patch subject: [PATCH 2/2] ext4: call ext4_handle_error when read extent failed in ext4_ext_insert_extent
in testcase: xfstests
version: xfstests-x86_64-fb6575e-1_20230116
with the following parameters:
disk: 4HDD
fs: ext4
fs2: smbv3
test: generic-group-12
test-description: xfstests is a regression test suite for xfs and other file systems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
on test machine: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (Skylake) with 16G memory
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
[ 198.355648][ T1847] ------------[ cut here ]------------
[ 198.360964][ T1847] kernel BUG at fs/ext4/extents_status.c:896!
[ 198.366894][ T1847] invalid opcode: 0000 [#1] SMP KASAN PTI
[ 198.372460][ T1847] CPU: 1 PID: 1847 Comm: smbd Not tainted 6.1.0-rc4-00062-gce10c493af38 #1
[ 198.380891][ T1847] Hardware name: HP HP Z240 SFF Workstation/802E, BIOS N51 Ver. 01.63 10/05/2017
[ 198.389823][ T1847] RIP: 0010:ext4_es_cache_extent (fs/ext4/extents_status.c:896)
[ 198.395662][ T1847] Code: 48 8b 0c 24 e9 98 fd ff ff 44 89 44 24 0c 48 89 0c 24 e8 b0 dc d0 ff 44 8b 44 24 0c 48 8b 0c 24 e9 55 fd ff ff e8 fd 30 97 01 <0f> 0b 48 c7 c7 40 ab 0f 85 e8 8f dc d0 ff e9 2d ff ff ff e8 85 dc
All code
========
0: 48 8b 0c 24 mov (%rsp),%rcx
4: e9 98 fd ff ff jmpq 0xfffffffffffffda1
9: 44 89 44 24 0c mov %r8d,0xc(%rsp)
e: 48 89 0c 24 mov %rcx,(%rsp)
12: e8 b0 dc d0 ff callq 0xffffffffffd0dcc7
17: 44 8b 44 24 0c mov 0xc(%rsp),%r8d
1c: 48 8b 0c 24 mov (%rsp),%rcx
20: e9 55 fd ff ff jmpq 0xfffffffffffffd7a
25: e8 fd 30 97 01 callq 0x1973127
2a:* 0f 0b ud2 <-- trapping instruction
2c: 48 c7 c7 40 ab 0f 85 mov $0xffffffff850fab40,%rdi
33: e8 8f dc d0 ff callq 0xffffffffffd0dcc7
38: e9 2d ff ff ff jmpq 0xffffffffffffff6a
3d: e8 .byte 0xe8
3e: 85 dc test %ebx,%esp
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 48 c7 c7 40 ab 0f 85 mov $0xffffffff850fab40,%rdi
9: e8 8f dc d0 ff callq 0xffffffffffd0dc9d
e: e9 2d ff ff ff jmpq 0xffffffffffffff40
13: e8 .byte 0xe8
14: 85 dc test %ebx,%esp
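The `2a:*` marker in the disassembly above can be cross-checked against the raw `Code:` line: the oops dump encloses the byte at the faulting RIP in angle brackets, so counting the bytes before it gives the offset. A minimal Python sketch, where `trapping_offset` is a hypothetical helper name and not part of any kernel tooling:

```python
import re


def trapping_offset(code_line: str) -> int:
    """Return the byte offset of the <xx>-marked byte in an oops 'Code:' line."""
    # Keep only what follows the 'Code:' label, then split into byte tokens.
    _, _, payload = code_line.partition("Code:")
    for offset, token in enumerate(payload.split()):
        # The byte at the faulting RIP is printed as <xx>.
        if re.fullmatch(r"<[0-9a-fA-F]{2}>", token):
            return offset
    raise ValueError("no <xx> marker found in Code: line")


# The kernel-side Code: line from this report, copied verbatim.
code = (
    "Code: 48 8b 0c 24 e9 98 fd ff ff 44 89 44 24 0c 48 89 0c 24 "
    "e8 b0 dc d0 ff 44 8b 44 24 0c 48 8b 0c 24 e9 55 fd ff ff "
    "e8 fd 30 97 01 <0f> 0b 48 c7 c7 40 ab 0f 85 e8 8f dc d0 ff "
    "e9 2d ff ff ff e8 85 dc"
)

print(hex(trapping_offset(code)))  # 0x2a
```

The `0f 0b` pair at that offset is `ud2`, the instruction x86 kernels use to implement `BUG()`, which is consistent with the "kernel BUG at" line.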
[ 198.415069][ T1847] RSP: 0018:ffffc90001ea76e8 EFLAGS: 00010203
[ 198.420989][ T1847] RAX: 07ffffffffffffff RBX: 1ffff920003d4edf RCX: 07ffffffffffffff
[ 198.428810][ T1847] RDX: 1ffff11022179815 RSI: 0000000000000050 RDI: ffff888110bcc0a8
[ 198.436631][ T1847] RBP: 000000000000002f R08: 47ffffffffffffff R09: ffffc90001ea7693
[ 198.444458][ T1847] R10: fffff520003d4ed2 R11: 0000000000000001 R12: ffff888429b96c20
[ 198.452281][ T1847] R13: 0000000000000050 R14: dffffc0000000000 R15: ffff888110bcc000
[ 198.460092][ T1847] FS: 00007fa7e95b0a40(0000) GS:ffff88837d080000(0000) knlGS:0000000000000000
[ 198.468862][ T1847] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 198.475300][ T1847] CR2: 00007fa7ed951c28 CR3: 000000011a476005 CR4: 00000000003706e0
[ 198.483096][ T1847] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 198.490900][ T1847] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 198.498707][ T1847] Call Trace:
[ 198.501856][ T1847] <TASK>
[ 198.504644][ T1847] ? check_heap_object (arch/x86/include/asm/bitops.h:207 arch/x86/include/asm/bitops.h:239 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/page-flags.h:782 include/linux/page-flags.h:803 include/linux/mm.h:732 include/linux/mm.h:1719 mm/usercopy.c:198)
[ 198.509587][ T1847] ? ext4_es_insert_extent (fs/ext4/extents_status.c:880)
[ 198.514877][ T1847] ? __check_object_size (mm/memremap.c:420)
[ 198.520512][ T1847] ? ext4_find_extent (fs/ext4/extents.c:916)
[ 198.525376][ T1847] ext4_cache_extents (fs/ext4/ext4_extents.h:207 fs/ext4/extents.c:539)
[ 198.530255][ T1847] ? ext4_find_extent (fs/ext4/extents.c:916)
[ 198.535132][ T1847] ext4_find_extent (fs/ext4/extents.c:815 fs/ext4/extents.c:954)
[ 198.539823][ T1847] ext4_ext_map_blocks (fs/ext4/extents.c:4103)
[ 198.544856][ T1847] ? sched_clock_cpu (kernel/sched/clock.c:369)
[ 198.549546][ T1847] ? ext4_ext_release (fs/ext4/extents.c:4088)
[ 198.554236][ T1847] ? _raw_spin_lock_bh (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:186 include/linux/spinlock_api_smp.h:127 kernel/locking/spinlock.c:178)
[ 198.559013][ T1847] ? raw_spin_rq_lock_nested (arch/x86/include/asm/preempt.h:85 kernel/sched/core.c:539)
[ 198.564308][ T1847] ? release_sock (include/net/sock.h:1820 include/net/sock.h:1825 net/core/sock.c:3470)
[ 198.568740][ T1847] ? tcp_recvmsg (net/ipv4/tcp.c:2682)
[ 198.573081][ T1847] ? __cond_resched (kernel/sched/core.c:8325)
[ 198.577600][ T1847] ? down_read (arch/x86/include/asm/atomic64_64.h:34 include/linux/atomic/atomic-long.h:41 include/linux/atomic/atomic-instrumented.h:1280 kernel/locking/rwsem.c:176 kernel/locking/rwsem.c:181 kernel/locking/rwsem.c:249 kernel/locking/rwsem.c:1259 kernel/locking/rwsem.c:1269 kernel/locking/rwsem.c:1511)
[ 198.581857][ T1847] ? rwsem_down_read_slowpath (kernel/locking/rwsem.c:1507)
[ 198.587408][ T1847] ? ext4_es_lookup_extent (arch/x86/include/asm/atomic.h:165 arch/x86/include/asm/atomic.h:178 include/linux/atomic/atomic-instrumented.h:147 include/asm-generic/qrwlock.h:113 include/linux/rwlock_api_smp.h:232 fs/ext4/extents_status.c:980)
[ 198.592705][ T1847] ext4_map_blocks (fs/ext4/inode.c:576)
[ 198.597403][ T1847] ? inet6_release (net/ipv6/af_inet6.c:673)
[ 198.601846][ T1847] ? ext4_issue_zeroout (fs/ext4/inode.c:508)
[ 198.606900][ T1847] ? jbd2_transaction_committed (arch/x86/include/asm/atomic.h:165 arch/x86/include/asm/atomic.h:178 include/linux/atomic/atomic-instrumented.h:147 include/asm-generic/qrwlock.h:113 include/linux/rwlock_api_smp.h:232 fs/jbd2/journal.c:813)
[ 198.612547][ T1847] ? ext4_set_iomap (arch/x86/include/asm/bitops.h:207 arch/x86/include/asm/bitops.h:239 include/asm-generic/bitops/instrumented-non-atomic.h:142 fs/ext4/ext4.h:1923 fs/ext4/inode.c:3421)
[ 198.617232][ T1847] ext4_iomap_begin_report (fs/ext4/inode.c:3672)
[ 198.622544][ T1847] ? mpage_map_one_extent (fs/ext4/inode.c:3631)
[ 198.627761][ T1847] ? mpage_map_one_extent (fs/ext4/inode.c:3631)
[ 198.632971][ T1847] ? __copy_msghdr (net/socket.c:2409)
[ 198.637571][ T1847] ? from_kgid_munged (kernel/user_namespace.c:524)
[ 198.642353][ T1847] ? __might_fault (mm/memory.c:5648)
[ 198.646795][ T1847] ? memset (mm/kasan/shadow.c:44)
[ 198.650623][ T1847] ? mpage_map_one_extent (fs/ext4/inode.c:3631)
[ 198.655849][ T1847] iomap_iter (fs/iomap/iter.c:76)
[ 198.660032][ T1847] ? iomap_iter (fs/iomap/iter.c:71)
[ 198.664375][ T1847] iomap_seek_hole (fs/iomap/seek.c:49)
[ 198.668989][ T1847] ? iomap_fiemap (fs/iomap/seek.c:35)
[ 198.673501][ T1847] ? rwsem_down_read_slowpath (kernel/locking/rwsem.c:1507)
[ 198.679068][ T1847] ? mutex_lock (arch/x86/include/asm/atomic64_64.h:190 include/linux/atomic/atomic-long.h:443 include/linux/atomic/atomic-instrumented.h:1781 kernel/locking/mutex.c:171 kernel/locking/mutex.c:285)
[ 198.683247][ T1847] ? __mutex_lock_slowpath (kernel/locking/mutex.c:282)
[ 198.688381][ T1847] ext4_llseek (include/linux/fs.h:771 fs/ext4/file.c:919)
[ 198.692638][ T1847] ksys_lseek (fs/read_write.c:289 fs/read_write.c:302)
[ 198.696720][ T1847] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[ 198.700976][ T1847] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[ 198.706705][ T1847] RIP: 0033:0x7fa7ed7d5647
[ 198.710976][ T1847] Code: ff ff ff ff c3 66 0f 1f 44 00 00 48 8b 15 79 a9 00 00 f7 d8 64 89 02 b8 ff ff ff ff eb b8 0f 1f 44 00 00 b8 08 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 51 a9 00 00 f7 d8 64 89 02 48
All code
========
0: ff (bad)
1: ff (bad)
2: ff (bad)
3: ff c3 inc %ebx
5: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
b: 48 8b 15 79 a9 00 00 mov 0xa979(%rip),%rdx # 0xa98b
12: f7 d8 neg %eax
14: 64 89 02 mov %eax,%fs:(%rdx)
17: b8 ff ff ff ff mov $0xffffffff,%eax
1c: eb b8 jmp 0xffffffffffffffd6
1e: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
23: b8 08 00 00 00 mov $0x8,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 01 ja 0x33
32: c3 retq
33: 48 8b 15 51 a9 00 00 mov 0xa951(%rip),%rdx # 0xa98b
3a: f7 d8 neg %eax
3c: 64 89 02 mov %eax,%fs:(%rdx)
3f: 48 rex.W
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 01 ja 0x9
8: c3 retq
9: 48 8b 15 51 a9 00 00 mov 0xa951(%rip),%rdx # 0xa961
10: f7 d8 neg %eax
12: 64 89 02 mov %eax,%fs:(%rdx)
15: 48 rex.W
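When reading the call trace above, entries prefixed with `?` are addresses the unwinder found on the stack but could not confirm as part of the active call chain; the unprefixed entries form the reliable path. A minimal Python sketch that keeps only those entries, assuming the line layout shown in this report (`reliable_frames` is a hypothetical helper):

```python
import re

# '[ time][ Tnnn] [? ]symbol (file:line ...)'
# Group 1 is the optional '?' marker, group 2 is the symbol name.
FRAME = re.compile(r"^\[\s*[\d.]+\]\[\s*T\d+\]\s+(\?\s+)?([A-Za-z_]\w*) \(")


def reliable_frames(lines):
    """Return symbols of call-trace entries that are not marked with '?'."""
    frames = []
    for line in lines:
        m = FRAME.match(line)
        if m and not m.group(1):
            frames.append(m.group(2))
    return frames


# A few lines copied from the trace in this report.
sample = [
    "[ 198.509587][ T1847] ? ext4_es_insert_extent (fs/ext4/extents_status.c:880)",
    "[ 198.525376][ T1847] ext4_cache_extents (fs/ext4/ext4_extents.h:207 fs/ext4/extents.c:539)",
    "[ 198.530255][ T1847] ? ext4_find_extent (fs/ext4/extents.c:916)",
    "[ 198.535132][ T1847] ext4_find_extent (fs/ext4/extents.c:815 fs/ext4/extents.c:954)",
    "[ 198.539823][ T1847] ext4_ext_map_blocks (fs/ext4/extents.c:4103)",
]

print(reliable_frames(sample))
```

Applied to the full trace, the unprefixed entries run from `ext4_cache_extents` up through `ext4_find_extent`, `ext4_ext_map_blocks`, `ext4_map_blocks`, `ext4_iomap_begin_report`, `iomap_iter`, `iomap_seek_hole`, `ext4_llseek` and `ksys_lseek`, that is, an `lseek` call issued by `smbd`.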
If you fix the issue, kindly add the following tags
| Reported-by: kernel test robot <yujie.liu@...el.com>
| Link: https://lore.kernel.org/oe-lkp/202301212330.48ec3159-yujie.liu@intel.com
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
View attachment "config-6.1.0-rc4-00062-gce10c493af38" of type "text/plain" (171137 bytes)
View attachment "job-script" of type "text/plain" (5981 bytes)
Download attachment "dmesg.xz" of type "application/x-xz" (44072 bytes)
View attachment "xfstests" of type "text/plain" (186712 bytes)
View attachment "job.yaml" of type "text/plain" (4625 bytes)