Message-ID: <80f28cfb-f287-419b-a448-b5967bc778ae@gmail.com>
Date: Sat, 22 Jun 2024 15:12:43 +0900
From: Yunseong Kim <yskelg@...il.com>
To: Taehee Yoo <ap420073@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>, Pedro Tammela <pctammela@...atatu.com>,
netdev@...r.kernel.org, stable@...r.kernel.org,
Steven Rostedt <rostedt@...dmis.org>, Masami Hiramatsu
<mhiramat@...nel.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Takashi Iwai <tiwai@...e.de>, "David S. Miller" <davem@...emloft.net>,
Thomas Hellström <thomas.hellstrom@...ux.intel.com>,
"Rafael J. Wysocki" <rafael@...nel.org>, Jamal Hadi Salim
<jhs@...atatu.com>, Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>, Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>, Austin Kim <austindh.kim@...il.com>,
shjy180909@...il.com, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org, ppbuk5246@...il.com,
Yeoreum Yun <yeoreum.yun@....com>, virtualization@...ts.linux.dev
Subject: Re: [PATCH v3] tracing/net_sched: NULL pointer dereference in
perf_trace_qdisc_reset()
Hi Taehee,
On 6/22/24 2:50 PM, Taehee Yoo wrote:
> On Sat, Jun 22, 2024 at 1:58 PM <yskelg@...il.com> wrote:
>>
>> From: Yunseong Kim <yskelg@...il.com>
>>
>
> Hi Yunseong,
> Thanks a lot for this work!
Thank you Taehee for reviewing our patch. It's greatly appreciated.
>> During qdisc initialization, qdisc was being set to noop_queue.
>> In veth_init_queue, the initial tx_num was reduced back to one,
>> causing the qdisc reset to be called with noop, which led to the kernel panic.
>>
>> I've attached a GitHub gist link containing the syz-execprog program
>> converted to C source code and 3 reproduced vmcore-dmesg logs.
>>
>> https://gist.github.com/yskelg/cc64562873ce249cdd0d5a358b77d740
>>
>> Yeoreum and I use two fuzzing tools simultaneously.
>>
>> One process with syz-executor : https://github.com/google/syzkaller
>>
>> $ ./syz-execprog -executor=./syz-executor -repeat=1 -sandbox=setuid \
>> -enable=none -collide=false log1
>>
>> The other process with perf fuzzer:
>> https://github.com/deater/perf_event_tests/tree/master/fuzzer
>>
>> $ perf_event_tests/fuzzer/perf_fuzzer
>>
>> I think this affects the following kernel versions:
>>
>> Linux kernel +v6.7.10, +v6.8, +v6.9, and it could also happen in v6.10.
>>
>> This has occurred since 51270d573a8d. I think this patch is absolutely
>> necessary. Previously, it was showing an unintended string value for the name.
> I found a simple reproducer, please use the below command to test this patch.
>
> echo 1 > /sys/kernel/debug/tracing/events/enable
> ip link add veth0 type veth peer name veth1
In our case the perf event was enabled by perf_fuzzer, so it is indeed a
similar environment involving veth.
> In my machine, the splat looks like:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000130
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 1 PID: 1207 Comm: ip Not tainted 6.10.0-rc4+ #25 362ec22a686962a9936425abea9a73f03b445c0c
> Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
> RIP: 0010:strlen+0x0/0x20
> Code: f7 75 ec 31 c0 c3 cc cc cc cc 48 89 f8 c3 cc cc cc cc 0f 1f 84 00 00 00 00 00 90 90 90 90 9c
> RSP: 0018:ffffbed8435c7630 EFLAGS: 00010206
> RAX: ffffffff92d629c0 RBX: ffffa14100185c60 RCX: 0000000000000000
> RDX: 0000000000000001 RSI: ffffffff92d62840 RDI: 0000000000000130
> RBP: ffffffff92dc4600 R08: 0000000000000fd0 R09: 0000000000000010
> R10: ffffffff92c66c98 R11: 0000000000000001 R12: 0000000000000001
> R13: 0000000000000000 R14: 0000000000000130 R15: ffffffff92d62840
> FS: 00007f6a94e50b80(0000) GS:ffffa1485f680000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000130 CR3: 0000000103414000 CR4: 00000000007506f0
> PKRU: 55555554
> Call Trace:
> <TASK>
> ? __die+0x20/0x70
> ? page_fault_oops+0x15a/0x460
> ? trace_event_raw_event_x86_exceptions+0x5f/0xa0
> ? exc_page_fault+0x6e/0x180
> ? asm_exc_page_fault+0x22/0x30
> ? __pfx_strlen+0x10/0x10
> trace_event_raw_event_qdisc_reset+0x4d/0x180
> ? synchronize_rcu_expedited+0x215/0x240
> ? __pfx_autoremove_wake_function+0x10/0x10
> qdisc_reset+0x130/0x150
> netif_set_real_num_tx_queues+0xe3/0x1e0
> veth_init_queues+0x44/0x70 [veth 24a9dd1cd1b1b279e1b467ad46d47a753799b428]
> veth_newlink+0x22b/0x440 [veth 24a9dd1cd1b1b279e1b467ad46d47a753799b428]
> __rtnl_newlink+0x718/0x990
> rtnl_newlink+0x44/0x70
> rtnetlink_rcv_msg+0x159/0x410
> ? kmalloc_reserve+0x90/0xf0
> ? trace_event_raw_event_kmem_cache_alloc+0x87/0xe0
> ? __pfx_rtnetlink_rcv_msg+0x10/0x10
> netlink_rcv_skb+0x54/0x100
> netlink_unicast+0x243/0x370
> netlink_sendmsg+0x1bb/0x3e0
> ____sys_sendmsg+0x2bb/0x320
> ? copy_msghdr_from_user+0x6d/0xa0
> ___sys_sendmsg+0x88/0xd0
>
> Thanks a lot!
> Taehee Yoo
I think this bug might inconvenience developers who work on net device
drivers in a virtual machine when they use tracing.
I appreciate your effort in reproducing it.
Warm Regards,
Yunseong Kim