Message-ID: <202301212330.48ec3159-yujie.liu@intel.com>
Date:   Sun, 22 Jan 2023 00:07:55 +0800
From:   kernel test robot <yujie.liu@...el.com>
To:     zhanchengbin <zhanchengbin1@...wei.com>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        Zhang Yi <yi.zhang@...wei.com>, <linux-ext4@...r.kernel.org>,
        <tytso@....edu>, <jack@...e.com>, <linfeilong@...wei.com>,
        <liuzhiqiang26@...wei.com>, zhanchengbin <zhanchengbin1@...wei.com>
Subject: Re: [PATCH 2/2] ext4: call ext4_handle_error when read extent failed
 in ext4_ext_insert_extent

Greetings,

FYI, we noticed kernel_BUG_at_fs/ext4/extents_status.c due to the following commit (built with gcc-11):

commit: ce10c493af382439876867dcaee89c7efddfab46 ("[PATCH 2/2] ext4: call ext4_handle_error when read extent failed in ext4_ext_insert_extent")
url: https://github.com/intel-lab-lkp/linux/commits/zhanchengbin/ext4-fix-inode-tree-inconsistency-caused-by-ENOMEM-in-ext4_split_extent_at/20230110-211157
base: https://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git dev
patch link: https://lore.kernel.org/all/20230110133407.994711-3-zhanchengbin1@huawei.com/
patch subject: [PATCH 2/2] ext4: call ext4_handle_error when read extent failed in ext4_ext_insert_extent

in testcase: xfstests
version: xfstests-x86_64-fb6575e-1_20230116
with the following parameters:

	disk: 4HDD
	fs: ext4
	fs2: smbv3
	test: generic-group-12

test-description: xfstests is a regression test suite for xfs and other filesystems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git

on test machine: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (Skylake) with 16G memory

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):


[  198.355648][ T1847] ------------[ cut here ]------------
[  198.360964][ T1847] kernel BUG at fs/ext4/extents_status.c:896!
[  198.366894][ T1847] invalid opcode: 0000 [#1] SMP KASAN PTI
[  198.372460][ T1847] CPU: 1 PID: 1847 Comm: smbd Not tainted 6.1.0-rc4-00062-gce10c493af38 #1
[  198.380891][ T1847] Hardware name: HP HP Z240 SFF Workstation/802E, BIOS N51 Ver. 01.63 10/05/2017
[ 198.389823][ T1847] RIP: 0010:ext4_es_cache_extent (fs/ext4/extents_status.c:896) 
[ 198.395662][ T1847] Code: 48 8b 0c 24 e9 98 fd ff ff 44 89 44 24 0c 48 89 0c 24 e8 b0 dc d0 ff 44 8b 44 24 0c 48 8b 0c 24 e9 55 fd ff ff e8 fd 30 97 01 <0f> 0b 48 c7 c7 40 ab 0f 85 e8 8f dc d0 ff e9 2d ff ff ff e8 85 dc
All code
========
   0:	48 8b 0c 24          	mov    (%rsp),%rcx
   4:	e9 98 fd ff ff       	jmpq   0xfffffffffffffda1
   9:	44 89 44 24 0c       	mov    %r8d,0xc(%rsp)
   e:	48 89 0c 24          	mov    %rcx,(%rsp)
  12:	e8 b0 dc d0 ff       	callq  0xffffffffffd0dcc7
  17:	44 8b 44 24 0c       	mov    0xc(%rsp),%r8d
  1c:	48 8b 0c 24          	mov    (%rsp),%rcx
  20:	e9 55 fd ff ff       	jmpq   0xfffffffffffffd7a
  25:	e8 fd 30 97 01       	callq  0x1973127
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	48 c7 c7 40 ab 0f 85 	mov    $0xffffffff850fab40,%rdi
  33:	e8 8f dc d0 ff       	callq  0xffffffffffd0dcc7
  38:	e9 2d ff ff ff       	jmpq   0xffffffffffffff6a
  3d:	e8                   	.byte 0xe8
  3e:	85 dc                	test   %ebx,%esp

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	48 c7 c7 40 ab 0f 85 	mov    $0xffffffff850fab40,%rdi
   9:	e8 8f dc d0 ff       	callq  0xffffffffffd0dc9d
   e:	e9 2d ff ff ff       	jmpq   0xffffffffffffff40
  13:	e8                   	.byte 0xe8
  14:	85 dc                	test   %ebx,%esp
[  198.415069][ T1847] RSP: 0018:ffffc90001ea76e8 EFLAGS: 00010203
[  198.420989][ T1847] RAX: 07ffffffffffffff RBX: 1ffff920003d4edf RCX: 07ffffffffffffff
[  198.428810][ T1847] RDX: 1ffff11022179815 RSI: 0000000000000050 RDI: ffff888110bcc0a8
[  198.436631][ T1847] RBP: 000000000000002f R08: 47ffffffffffffff R09: ffffc90001ea7693
[  198.444458][ T1847] R10: fffff520003d4ed2 R11: 0000000000000001 R12: ffff888429b96c20
[  198.452281][ T1847] R13: 0000000000000050 R14: dffffc0000000000 R15: ffff888110bcc000
[  198.460092][ T1847] FS:  00007fa7e95b0a40(0000) GS:ffff88837d080000(0000) knlGS:0000000000000000
[  198.468862][ T1847] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  198.475300][ T1847] CR2: 00007fa7ed951c28 CR3: 000000011a476005 CR4: 00000000003706e0
[  198.483096][ T1847] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  198.490900][ T1847] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  198.498707][ T1847] Call Trace:
[  198.501856][ T1847]  <TASK>
[ 198.504644][ T1847] ? check_heap_object (arch/x86/include/asm/bitops.h:207 arch/x86/include/asm/bitops.h:239 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/page-flags.h:782 include/linux/page-flags.h:803 include/linux/mm.h:732 include/linux/mm.h:1719 mm/usercopy.c:198) 
[ 198.509587][ T1847] ? ext4_es_insert_extent (fs/ext4/extents_status.c:880) 
[ 198.514877][ T1847] ? __check_object_size (mm/memremap.c:420) 
[ 198.520512][ T1847] ? ext4_find_extent (fs/ext4/extents.c:916) 
[ 198.525376][ T1847] ext4_cache_extents (fs/ext4/ext4_extents.h:207 fs/ext4/extents.c:539) 
[ 198.530255][ T1847] ? ext4_find_extent (fs/ext4/extents.c:916) 
[ 198.535132][ T1847] ext4_find_extent (fs/ext4/extents.c:815 fs/ext4/extents.c:954) 
[ 198.539823][ T1847] ext4_ext_map_blocks (fs/ext4/extents.c:4103) 
[ 198.544856][ T1847] ? sched_clock_cpu (kernel/sched/clock.c:369) 
[ 198.549546][ T1847] ? ext4_ext_release (fs/ext4/extents.c:4088) 
[ 198.554236][ T1847] ? _raw_spin_lock_bh (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:186 include/linux/spinlock_api_smp.h:127 kernel/locking/spinlock.c:178) 
[ 198.559013][ T1847] ? raw_spin_rq_lock_nested (arch/x86/include/asm/preempt.h:85 kernel/sched/core.c:539) 
[ 198.564308][ T1847] ? release_sock (include/net/sock.h:1820 include/net/sock.h:1825 net/core/sock.c:3470) 
[ 198.568740][ T1847] ? tcp_recvmsg (net/ipv4/tcp.c:2682) 
[ 198.573081][ T1847] ? __cond_resched (kernel/sched/core.c:8325) 
[ 198.577600][ T1847] ? down_read (arch/x86/include/asm/atomic64_64.h:34 include/linux/atomic/atomic-long.h:41 include/linux/atomic/atomic-instrumented.h:1280 kernel/locking/rwsem.c:176 kernel/locking/rwsem.c:181 kernel/locking/rwsem.c:249 kernel/locking/rwsem.c:1259 kernel/locking/rwsem.c:1269 kernel/locking/rwsem.c:1511) 
[ 198.581857][ T1847] ? rwsem_down_read_slowpath (kernel/locking/rwsem.c:1507) 
[ 198.587408][ T1847] ? ext4_es_lookup_extent (arch/x86/include/asm/atomic.h:165 arch/x86/include/asm/atomic.h:178 include/linux/atomic/atomic-instrumented.h:147 include/asm-generic/qrwlock.h:113 include/linux/rwlock_api_smp.h:232 fs/ext4/extents_status.c:980) 
[ 198.592705][ T1847] ext4_map_blocks (fs/ext4/inode.c:576) 
[ 198.597403][ T1847] ? inet6_release (net/ipv6/af_inet6.c:673) 
[ 198.601846][ T1847] ? ext4_issue_zeroout (fs/ext4/inode.c:508) 
[ 198.606900][ T1847] ? jbd2_transaction_committed (arch/x86/include/asm/atomic.h:165 arch/x86/include/asm/atomic.h:178 include/linux/atomic/atomic-instrumented.h:147 include/asm-generic/qrwlock.h:113 include/linux/rwlock_api_smp.h:232 fs/jbd2/journal.c:813) 
[ 198.612547][ T1847] ? ext4_set_iomap (arch/x86/include/asm/bitops.h:207 arch/x86/include/asm/bitops.h:239 include/asm-generic/bitops/instrumented-non-atomic.h:142 fs/ext4/ext4.h:1923 fs/ext4/inode.c:3421) 
[ 198.617232][ T1847] ext4_iomap_begin_report (fs/ext4/inode.c:3672) 
[ 198.622544][ T1847] ? mpage_map_one_extent (fs/ext4/inode.c:3631) 
[ 198.627761][ T1847] ? mpage_map_one_extent (fs/ext4/inode.c:3631) 
[ 198.632971][ T1847] ? __copy_msghdr (net/socket.c:2409) 
[ 198.637571][ T1847] ? from_kgid_munged (kernel/user_namespace.c:524) 
[ 198.642353][ T1847] ? __might_fault (mm/memory.c:5648) 
[ 198.646795][ T1847] ? memset (mm/kasan/shadow.c:44) 
[ 198.650623][ T1847] ? mpage_map_one_extent (fs/ext4/inode.c:3631) 
[ 198.655849][ T1847] iomap_iter (fs/iomap/iter.c:76) 
[ 198.660032][ T1847] ? iomap_iter (fs/iomap/iter.c:71) 
[ 198.664375][ T1847] iomap_seek_hole (fs/iomap/seek.c:49) 
[ 198.668989][ T1847] ? iomap_fiemap (fs/iomap/seek.c:35) 
[ 198.673501][ T1847] ? rwsem_down_read_slowpath (kernel/locking/rwsem.c:1507) 
[ 198.679068][ T1847] ? mutex_lock (arch/x86/include/asm/atomic64_64.h:190 include/linux/atomic/atomic-long.h:443 include/linux/atomic/atomic-instrumented.h:1781 kernel/locking/mutex.c:171 kernel/locking/mutex.c:285) 
[ 198.683247][ T1847] ? __mutex_lock_slowpath (kernel/locking/mutex.c:282) 
[ 198.688381][ T1847] ext4_llseek (include/linux/fs.h:771 fs/ext4/file.c:919) 
[ 198.692638][ T1847] ksys_lseek (fs/read_write.c:289 fs/read_write.c:302) 
[ 198.696720][ T1847] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) 
[ 198.700976][ T1847] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) 
[  198.706705][ T1847] RIP: 0033:0x7fa7ed7d5647
[ 198.710976][ T1847] Code: ff ff ff ff c3 66 0f 1f 44 00 00 48 8b 15 79 a9 00 00 f7 d8 64 89 02 b8 ff ff ff ff eb b8 0f 1f 44 00 00 b8 08 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 51 a9 00 00 f7 d8 64 89 02 48
All code
========
   0:	ff                   	(bad)  
   1:	ff                   	(bad)  
   2:	ff                   	(bad)  
   3:	ff c3                	inc    %ebx
   5:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
   b:	48 8b 15 79 a9 00 00 	mov    0xa979(%rip),%rdx        # 0xa98b
  12:	f7 d8                	neg    %eax
  14:	64 89 02             	mov    %eax,%fs:(%rdx)
  17:	b8 ff ff ff ff       	mov    $0xffffffff,%eax
  1c:	eb b8                	jmp    0xffffffffffffffd6
  1e:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  23:	b8 08 00 00 00       	mov    $0x8,%eax
  28:	0f 05                	syscall 
  2a:*	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax		<-- trapping instruction
  30:	77 01                	ja     0x33
  32:	c3                   	retq   
  33:	48 8b 15 51 a9 00 00 	mov    0xa951(%rip),%rdx        # 0xa98b
  3a:	f7 d8                	neg    %eax
  3c:	64 89 02             	mov    %eax,%fs:(%rdx)
  3f:	48                   	rex.W

Code starting with the faulting instruction
===========================================
   0:	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax
   6:	77 01                	ja     0x9
   8:	c3                   	retq   
   9:	48 8b 15 51 a9 00 00 	mov    0xa951(%rip),%rdx        # 0xa961
  10:	f7 d8                	neg    %eax
  12:	64 89 02             	mov    %eax,%fs:(%rdx)
  15:	48                   	rex.W
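
The non-speculative frames in the trace above (ksys_lseek, ext4_llseek, iomap_seek_hole) together with the user-space `mov $0x8,%eax; syscall` sequence indicate that smbd entered the kernel through lseek() with SEEK_HOLE. As a rough illustration only, the same entry point can be exercised from user space as sketched below. The helper name `first_hole_offset` and the scratch file are assumptions made for this example; running it is not expected to reproduce the BUG by itself, since the report was triggered by xfstests generic-group-12 over smbv3.

```python
import os
import tempfile


def first_hole_offset(path, start=0):
    """Return the offset of the first hole at or after `start`.

    Hypothetical helper for illustration: it issues lseek(fd, start, SEEK_HOLE),
    which on ext4 is served by ext4_llseek -> iomap_seek_hole.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, start, os.SEEK_HOLE)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Scratch file with 4096 bytes of data and no explicit holes (an assumption
    # for the example, not the workload used by the test robot).
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"x" * 4096)
        tmp.flush()
        print(first_hole_offset(tmp.name))
```

For a file without holes, SEEK_HOLE reports the implicit hole at end of file, so the call returns the file size.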


If you fix the issue, kindly add the following tags
| Reported-by: kernel test robot <yujie.liu@...el.com>
| Link: https://lore.kernel.org/oe-lkp/202301212330.48ec3159-yujie.liu@intel.com


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

View attachment "config-6.1.0-rc4-00062-gce10c493af38" of type "text/plain" (171137 bytes)

View attachment "job-script" of type "text/plain" (5981 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (44072 bytes)

View attachment "xfstests" of type "text/plain" (186712 bytes)

View attachment "job.yaml" of type "text/plain" (4625 bytes)
