[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87o7di8o50.fsf@mailhost.krisman.be>
Date: Thu, 18 Jan 2024 15:58:19 -0300
From: Gabriel Krisman Bertazi <krisman@...e.de>
To: kernel test robot <oliver.sang@...el.com>
Cc: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
<linux-fscrypt@...r.kernel.org>, <viro@...iv.linux.org.uk>,
<ebiggers@...nel.org>, <jaegeuk@...nel.org>, <tytso@....edu>,
<linux-f2fs-devel@...ts.sourceforge.net>, <linux-ext4@...r.kernel.org>,
<linux-fsdevel@...r.kernel.org>, <amir73il@...il.com>
Subject: Re: [PATCH v3 04/10] fscrypt: Drop d_revalidate once the key is added
kernel test robot <oliver.sang@...el.com> writes:
> Hello,
>
> kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:
>
> commit: 1cfe4ba685d9eb6123648a0d9bef2d3d57b078ef ("[PATCH v3 04/10] fscrypt: Drop d_revalidate once the key is added")
> url: https://github.com/intel-lab-lkp/linux/commits/Gabriel-Krisman-Bertazi/ovl-Reject-mounting-case-insensitive-filesystems/20240112-070113
> base: https://git.kernel.org/cgit/linux/kernel/git/jaegeuk/f2fs.git dev-test
> patch link: https://lore.kernel.org/all/20240111225816.18117-5-krisman@suse.de/
> patch subject: [PATCH v3 04/10] fscrypt: Drop d_revalidate once the key is added
>
> in testcase: fxmark
> version: fxmark-x86_64-0ce9491-1_20220601
> with following parameters:
>
> disk: 1SSD
> media: ssd
> test: MWRL
> fstype: xfs
> directio: bufferedio
> cpufreq_governor: performance
>
>
>
> compiler: gcc-12
> test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@...el.com>
> | Closes: https://lore.kernel.org/oe-lkp/202401181648.4192e541-oliver.sang@intel.com
>
>
> [ 73.173380][ T6828] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [ 73.181338][ T6828] #PF: supervisor read access in kernel mode
> [ 73.187453][ T6828] #PF: error_code(0x0000) - not-present page
> [ 73.193566][ T6828] PGD 11cc47067 P4D 0
> [ 73.197762][ T6828] Oops: 0000 [#1] SMP NOPTI
> [ 73.202383][ T6828] CPU: 16 PID: 6828 Comm: fxmark Tainted: G S 6.7.0-rc1-00176-g1cfe4ba685d9 #1
> [ 73.212818][ T6828] Hardware name: Intel Corporation M50CYP2SB1U/M50CYP2SB1U, BIOS SE5C620.86B.01.01.0003.2104260124 04/26/2021
> [ 73.224837][ T6828] RIP: 0010:__d_move (include/linux/fscrypt.h:241 fs/dcache.c:3003)
> [ 73.229912][ T6828] Code: c1 00 00 00 08 0f 84 ed 00 00 00 81 e1 3f 10 07 00 0f 84 e1 00 00 00 80 cc 40 89 c1 81 e1 ff ff ff fd 41 89 4d 00 49 8b 4d 60 <48> 81 39 10 21 4e 81 0f 84 66 01 00 00 83 43 04 01 41 83 45 04 01
> All code
> ========
> 0: c1 00 00 roll $0x0,(%rax)
> 3: 00 08 add %cl,(%rax)
> 5: 0f 84 ed 00 00 00 je 0xf8
> b: 81 e1 3f 10 07 00 and $0x7103f,%ecx
> 11: 0f 84 e1 00 00 00 je 0xf8
> 17: 80 cc 40 or $0x40,%ah
> 1a: 89 c1 mov %eax,%ecx
> 1c: 81 e1 ff ff ff fd and $0xfdffffff,%ecx
> 22: 41 89 4d 00 mov %ecx,0x0(%r13)
> 26: 49 8b 4d 60 mov 0x60(%r13),%rcx
> 2a:* 48 81 39 10 21 4e 81 cmpq $0xffffffff814e2110,(%rcx) <-- trapping instruction
> 31: 0f 84 66 01 00 00 je 0x19d
> 37: 83 43 04 01 addl $0x1,0x4(%rbx)
> 3b: 41 83 45 04 01 addl $0x1,0x4(%r13)
>
> Code starting with the faulting instruction
> ===========================================
> 0: 48 81 39 10 21 4e 81 cmpq $0xffffffff814e2110,(%rcx)
> 7: 0f 84 66 01 00 00 je 0x173
> d: 83 43 04 01 addl $0x1,0x4(%rbx)
> 11: 41 83 45 04 01 addl $0x1,0x4(%r13)
> [ 73.249920][ T6828] RSP: 0018:ffa000000a99bce8 EFLAGS: 00010206
> [ 73.256134][ T6828] RAX: 0000000000480000 RBX: ff1100012cab5380 RCX: 0000000000000000
> [ 73.264248][ T6828] RDX: ff1100012cab4609 RSI: 0000000000000000 RDI: ff1100012cab4600
> [ 73.272366][ T6828] RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000020c
> [ 73.280473][ T6828] R10: ff110001622ddde0 R11: 0000000000010000 R12: 0000000000000000
> [ 73.288584][ T6828] R13: ff1100012cab4600 R14: 0000000000000000 R15: ff1100012cab5200
> [ 73.296699][ T6828] FS: 00007f1073011600(0000) GS:ff1100103f600000(0000) knlGS:0000000000000000
> [ 73.305766][ T6828] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 73.312488][ T6828] CR2: 0000000000000000 CR3: 000000012af2a006 CR4: 0000000000771ef0
> [ 73.320596][ T6828] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 73.328699][ T6828] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 73.336803][ T6828] PKRU: 55555554
> [ 73.340485][ T6828] Call Trace:
> [ 73.343900][ T6828] <TASK>
> [ 73.346960][ T6828] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
> [ 73.350983][ T6828] ? page_fault_oops (arch/x86/mm/fault.c:707)
> [ 73.355957][ T6828] ? exc_page_fault (arch/x86/include/asm/irqflags.h:37 arch/x86/include/asm/irqflags.h:72 arch/x86/mm/fault.c:1513 arch/x86/mm/fault.c:1561)
> [ 73.360837][ T6828] ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:570)
> [ 73.365974][ T6828] ? __d_move (include/linux/fscrypt.h:241 fs/dcache.c:3003)
> [ 73.370410][ T6828] ? __d_move (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:457 include/linux/atomic/atomic-instrumented.h:33 include/asm-generic/qspinlock.h:57 include/linux/fsnotify_backend.h:580 fs/dcache.c:3002)
> [ 73.374846][ T6828] d_move (include/linux/seqlock.h:500 include/linux/seqlock.h:572 include/linux/seqlock.h:910 fs/dcache.c:3032)
> [ 73.378757][ T6828] vfs_rename (include/linux/fs.h:807 fs/namei.c:4864)
> [ 73.383189][ T6828] ? do_renameat2 (fs/namei.c:4996)
> [ 73.387963][ T6828] do_renameat2 (fs/namei.c:4996)
> [ 73.392568][ T6828] __x64_sys_rename (fs/namei.c:5040)
> [ 73.397336][ T6828] do_syscall_64 (arch/x86/entry/common.c:51 arch/x86/entry/common.c:82)
> [ 73.401835][ T6828] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
> [ 73.407817][ T6828] RIP: 0033:0x7f1072e83ed7
> [ 73.412325][ T6828] Code: e8 6e 82 09 00 85 c0 0f 95 c0 0f b6 c0 f7 d8 5d c3 66 90 b8 ff ff ff ff 5d c3 66 0f 1f 84 00 00 00 00 00 b8 52 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 89 8f 17 00 f7 d8 64 89 02 b8
> All code
> ========
> 0: e8 6e 82 09 00 callq 0x98273
> 5: 85 c0 test %eax,%eax
> 7: 0f 95 c0 setne %al
> a: 0f b6 c0 movzbl %al,%eax
> d: f7 d8 neg %eax
> f: 5d pop %rbp
> 10: c3 retq
> 11: 66 90 xchg %ax,%ax
> 13: b8 ff ff ff ff mov $0xffffffff,%eax
> 18: 5d pop %rbp
> 19: c3 retq
> 1a: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
> 21: 00 00
> 23: b8 52 00 00 00 mov $0x52,%eax
> 28: 0f 05 syscall
> 2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
> 30: 77 01 ja 0x33
> 32: c3 retq
> 33: 48 8b 15 89 8f 17 00 mov 0x178f89(%rip),%rdx # 0x178fc3
> 3a: f7 d8 neg %eax
> 3c: 64 89 02 mov %eax,%fs:(%rdx)
> 3f: b8 .byte 0xb8
>
Hm. So, the thing I missed here is that fscrypt_handle_d_move will be
called even by filesystems that don't support fscrypt. While we know
fscrypt filesystem dentries must have ->d_op, others might not have
it, and the dereferencing of dentry->d_op causes the oops at:
if (dentry->d_op->d_revalidate == fscrypt_d_revalidate)
causes the Oops.
One fix would be to prevent non-fscrypt filesystems from calling this
function. But since __d_move only touches the dentries, I think I'll
leave it as-is and just do:
if (dentry->d_op && dentry->d_op->d_revalidate)
sorry for the noise.
--
Gabriel Krisman Bertazi
Powered by blists - more mailing lists