Message-ID: <MEYP282MB2312A1362E60100C677C08ACC6072@MEYP282MB2312.AUSP282.PROD.OUTLOOK.COM>
Date: Fri, 20 Dec 2024 17:00:59 +0800
From: Levi Zim <rsworktech@...look.com>
To: John Fastabend <john.fastabend@...il.com>, Björn Töpel <bjorn@...nel.org>, Cong Wang <xiyou.wangcong@...il.com>
Cc: Jakub Sitnicki <jakub@...udflare.com>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, David Ahern <dsahern@...nel.org>,
netdev@...r.kernel.org, bpf@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
On 2024-12-20 15:56, John Fastabend wrote:
> Björn Töpel wrote:
>> Björn Töpel <bjorn@...nel.org> writes:
>>
>>> Levi Zim <rsworktech@...look.com> writes:
>>>
>>>> On 2024-12-04 09:01, Cong Wang wrote:
>>>>> On Sun, Dec 01, 2024 at 09:42:08AM +0800, Levi Zim wrote:
>>>>>> On 2024-11-30 21:38, Levi Zim via B4 Relay wrote:
>>>>>>> I found that bpf kselftest sockhash::test_txmsg_cork_hangs in
>>>>>>> test_sockmap.c triggers a kernel NULL pointer dereference:
>>>>> Interesting, I also ran this test recently and I didn't see such a
>>>>> crash.
>>>> I am also curious why other people or the CI didn't hit this crash.
>>> FWIW, I'm hitting it on RISC-V:
>>>
>>> | Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000000008
>>> | Oops [#1]
>>> | Modules linked in: sch_fq_codel drm fuse drm_panel_orientation_quirks backlight
>>> | CPU: 7 UID: 0 PID: 732 Comm: test_sockmap Not tainted 6.13.0-rc3-00017-gf44d154d6e3d #1
>>> | Hardware name: riscv-virtio qemu/qemu, BIOS 2025.01-rc3-00042-gacab6e78aca7 01/01/2025
>>> | epc : splice_to_socket+0x376/0x49a
>>> | ra : splice_to_socket+0x37c/0x49a
>>> | epc : ffffffff803d9ffc ra : ffffffff803da002 sp : ff20000001c3b8b0
>>> | gp : ffffffff827aefa8 tp : ff60000083450040 t0 : ff6000008a12d001
>>> | t1 : 0000100100001001 t2 : 0000000000000000 s0 : ff20000001c3bae0
>>> | s1 : ffffffffffffefff a0 : ff6000008245e200 a1 : ff60000087dd0450
>>> | a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
>>> | a5 : 0000000000000000 a6 : ff20000001c3b450 a7 : ff6000008a12c004
>>> | s2 : 000000000000000f s3 : ff6000008245e2d0 s4 : ff6000008245e280
>>> | s5 : 0000000000000000 s6 : 0000000000000002 s7 : 0000000000001001
>>> | s8 : 0000000000003001 s9 : 0000000000000002 s10: 0000000000000002
>>> | s11: ff6000008245e200 t3 : ffffffff8001e78c t4 : 0000000000000000
>>> | t5 : 0000000000000000 t6 : ff6000008869f230
>>> | status: 0000000200000120 badaddr: 0000000000000008 cause: 000000000000000d
>>> | [<ffffffff803d9ffc>] splice_to_socket+0x376/0x49a
>>> | [<ffffffff803d8bc0>] direct_splice_actor+0x44/0x216
>>> | [<ffffffff803d8532>] splice_direct_to_actor+0xb6/0x1e8
>>> | [<ffffffff803d8780>] do_splice_direct+0x70/0xa2
>>> | [<ffffffff80392e40>] do_sendfile+0x26e/0x2d4
>>> | [<ffffffff803939d4>] __riscv_sys_sendfile64+0xf2/0x10e
>>> | [<ffffffff80fdfb64>] do_trap_ecall_u+0x1f8/0x26c
>>> | [<ffffffff80fedaee>] _new_vmalloc_restore_context_a0+0xc6/0xd2
>>> | Code: c5d8 9e35 c590 8bb3 40db eb01 6998 b823 0005 856e (6718) 2d05
>>> | ---[ end trace 0000000000000000 ]---
>>> | Kernel panic - not syncing: Fatal exception
>>> | SMP: stopping secondary CPUs
>>> | ---[ end Kernel panic - not syncing: Fatal exception ]---
>>>
>>> This is commit f44d154d6e3d ("Merge tag 'soc-fixes-6.13' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc").
>>>
>>> (Yet to bisect!)
>> Took the series for a run, and it does solve the crash, but I'm getting
>> additional failures:
> Hi Bjorn,
>
> Thanks! I'm guessing those tests were failing even without the patch
> though right?
IIRC, those kTLS tests were already failing when I manually commented out
the cork hangs test (the one that crashes the kernel).
>
> Thanks,
> John
>
>> | [TEST 298]: (512, 1, 3, sendpage, pass,pop (1,3),ktls,): socket(peer2) kTLS enabled
>> | socket(client1) kTLS enabled
>> | recv failed(): Invalid argument
>> | rx thread exited with err 1.
>> | FAILED
>> | [TEST 299]: (100, 1, 5, sendpage, pass,pop (1,3),ktls,): socket(peer2) kTLS enabled
>> | socket(client1) kTLS enabled
>> | recv failed(): Invalid argument
>> | rx thread exited with err 1.
>> | FAILED
>> | [TEST 300]: (2, 32, 8192, sendpage, pass,pop (4096,8192),ktls,): socket(peer2) kTLS enabled
>> | socket(client1) kTLS enabled
>> | recv failed(): Bad message
>> | rx thread exited with err 1.
>> | FAILED
>> | ...
>> | #42/ 9 sockhash:ktls:txmsg test pop-data:FAIL
>> | ...
>> | [TEST 308]: (2, 32, 8192, sendpage, pass,pop (5,21),ktls,): socket(peer2) kTLS enabled
>> | socket(client1) kTLS enabled
>> | recv failed(): Bad message
>> | rx thread exited with err 1.
>> | FAILED
>> | [TEST 309]: (2, 32, 8192, sendpage, pass,pop (1,11),ktls,): socket(peer2) kTLS enabled
>> | socket(client1) kTLS enabled
>> | recv failed(): Bad message
>> | rx thread exited with err 1.
>> | FAILED
>> | ...
>> | #43/ 6 sockhash:ktls:txmsg test push/pop data:FAIL
>>
>
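
For reference, the "status:" line in the RISC-V oops quoted above reports
badaddr 0x8 and cause 0xd. Below is a minimal sketch for decoding that line,
assuming the standard RISC-V privileged-spec exception codes (13 is a load
page fault). The function name, the partial cause table, and the "address
within the first page" heuristic are illustrative assumptions, not part of
any kernel tooling.

```python
import re

# Subset of RISC-V scause exception codes (privileged spec); not exhaustive.
RISCV_CAUSES = {
    12: "instruction page fault",
    13: "load page fault",
    15: "store/AMO page fault",
}


def decode_status_line(line, page_size=4096):
    """Extract badaddr and cause from a RISC-V oops 'status:' line."""
    m = re.search(r"badaddr:\s*([0-9a-fA-F]+)\s+cause:\s*([0-9a-fA-F]+)", line)
    if m is None:
        raise ValueError("no badaddr/cause fields found")
    badaddr = int(m.group(1), 16)
    cause = int(m.group(2), 16)
    return {
        "badaddr": badaddr,
        "cause": RISCV_CAUSES.get(cause, "unknown (code %d)" % cause),
        # Heuristic (assumption): a fault address inside the first page
        # usually means a field access through a NULL pointer.
        "likely_null_deref": badaddr < page_size,
    }


line = "status: 0000000200000120 badaddr: 0000000000000008 cause: 000000000000000d"
info = decode_status_line(line)
print(info)
```

Under these assumptions, the quoted oops decodes to a load page fault at
address 0x8, which would be consistent with reading a field at offset 8
through a NULL pointer inside splice_to_socket().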