Message-ID: <6bc2bab66d3bc7aebbde92d4f268effe6b62db35.camel@redhat.com>
Date: Tue, 12 Mar 2024 12:04:29 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Kuniyuki Iwashima <kuni1840@...il.com>, netdev@...r.kernel.org,
linux-rdma@...r.kernel.org, rds-devel@....oracle.com, syzkaller
<syzkaller@...glegroups.com>, Kuniyuki Iwashima <kuniyu@...zon.com>, "David
S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, Allison
Henderson <allison.henderson@...cle.com>
Subject: Re: [PATCH v5 net 2/2] rds: tcp: Fix use-after-free of net in
reqsk_timer_handler().
On Fri, 2024-03-08 at 12:01 -0800, Kuniyuki Iwashima wrote:
> syzkaller reported a warning of netns tracker [0] followed by KASAN
> splat [1] and another ref tracker warning [2].
>
> syzkaller could not find a repro, but in the log, the only suspicious
> sequence was as follows:
>
> 18:26:22 executing program 1:
> r0 = socket$inet6_mptcp(0xa, 0x1, 0x106)
> ...
> connect$inet6(r0, &(0x7f0000000080)={0xa, 0x4001, 0x0, @loopback}, 0x1c) (async)
>
> The notable thing here is 0x4001 in connect(), which is RDS_TCP_PORT.
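For anyone decoding the syz program by hand, here is a minimal user-space sketch of the address that the logged connect$inet6() arguments describe. The names SKETCH_RDS_TCP_PORT and build_rds_tcp_loopback_addr() are made up for illustration; only the values (0xa, 0x4001, @loopback, 0x1c) come from the log line above, and the sketch assumes a Linux userspace.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical local name; 0x4001 (16385) is the port seen in the log. */
#define SKETCH_RDS_TCP_PORT 0x4001

/* Fill *addr as {0xa, 0x4001, 0x0, @loopback} and return its length. */
static socklen_t build_rds_tcp_loopback_addr(struct sockaddr_in6 *addr)
{
	/* The 0x1c length in the log matches sizeof(struct sockaddr_in6). */
	assert(sizeof(*addr) == 0x1c);

	memset(addr, 0, sizeof(*addr));
	addr->sin6_family = AF_INET6;                 /* 0xa on Linux */
	addr->sin6_port = htons(SKETCH_RDS_TCP_PORT); /* network byte order */
	addr->sin6_flowinfo = 0;
	addr->sin6_addr = in6addr_loopback;           /* ::1 */
	return sizeof(*addr);
}
```

Passing the result to connect() on an AF_INET6 stream socket inside a fresh netns would target the per-netns RDS-TCP listener, provided rds_tcp is loaded; that part is not exercised here.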
>
> So, the scenario would be:
>
> 1. unshare(CLONE_NEWNET) creates a per netns tcp listener in
> rds_tcp_listen_init().
> 2. syz-executor connect()s to it and creates a reqsk.
> 3. syz-executor exit()s immediately.
> 4. netns is dismantled. [0]
> 5. reqsk timer is fired, and UAF happens while freeing reqsk. [1]
> 6. listener is freed after RCU grace period. [2]
>
> Basically, reqsk assumes that the listener guarantees netns safety
> until all reqsk timers are expired by holding the listener's refcount.
> However, this was not the case for kernel sockets.
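The lifetime argument above can be modelled with a toy refcount, purely as an illustration. Nothing below is kernel code: struct toy_net, struct toy_listener and the toy_*() helpers are hypothetical, and the only kernel fact assumed is that a kernel socket does not take a counted reference on its netns while a user socket does.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_net {
	int refcnt;	/* counted references to the namespace */
	bool freed;	/* set once the last reference is dropped */
};

struct toy_listener {
	struct toy_net *net;
	bool net_refcnt;	/* true if the listener pins its netns */
};

static void toy_net_put(struct toy_net *net)
{
	assert(net->refcnt > 0);
	if (--net->refcnt == 0)
		net->freed = true;
}

/* Step 1: create the listener; only user sockets pin the netns. */
static void toy_listen(struct toy_listener *lsk, struct toy_net *net,
		       bool kernel_socket)
{
	lsk->net = net;
	lsk->net_refcnt = !kernel_socket;
	if (lsk->net_refcnt)
		net->refcnt++;
}

/* Returns true if the reqsk timer would run against a freed netns. */
static bool toy_timer_hits_freed_net(bool kernel_socket)
{
	struct toy_net net = { .refcnt = 1, .freed = false }; /* held by the process */
	struct toy_listener lsk;

	toy_listen(&lsk, &net, kernel_socket);
	/* Step 2: the reqsk keeps the listener alive (lsk stays in scope). */
	toy_net_put(&net);	/* Steps 3-4: process exits, netns may be dismantled. */
	return lsk.net->freed;	/* Step 5: the timer reaches the netns via the listener. */
}
```

In this model toy_timer_hits_freed_net(true) reports the kernel-listener case as unsafe, while toy_timer_hits_freed_net(false) does not, which is the asymmetry the commit message describes.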
>
> Commit 740ea3c4a0b2 ("tcp: Clean up kernel listener's reqsk in
> inet_twsk_purge()") fixed this issue only for per-netns ehash.
>
> Let's apply the same fix for the global ehash.
>
> [0]:
> ref_tracker: net notrefcnt@...0000065449cc3 has 1/1 users at
> sk_alloc (./include/net/net_namespace.h:337 net/core/sock.c:2146)
> inet6_create (net/ipv6/af_inet6.c:192 net/ipv6/af_inet6.c:119)
> __sock_create (net/socket.c:1572)
> rds_tcp_listen_init (net/rds/tcp_listen.c:279)
> rds_tcp_init_net (net/rds/tcp.c:577)
> ops_init (net/core/net_namespace.c:137)
> setup_net (net/core/net_namespace.c:340)
> copy_net_ns (net/core/net_namespace.c:497)
> create_new_namespaces (kernel/nsproxy.c:110)
> unshare_nsproxy_namespaces (kernel/nsproxy.c:228 (discriminator 4))
> ksys_unshare (kernel/fork.c:3429)
> __x64_sys_unshare (kernel/fork.c:3496)
> do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
> ...
> WARNING: CPU: 0 PID: 27 at lib/ref_tracker.c:179 ref_tracker_dir_exit (lib/ref_tracker.c:179)
>
> [1]:
> BUG: KASAN: slab-use-after-free in inet_csk_reqsk_queue_drop (./include/net/inet_hashtables.h:180 net/ipv4/inet_connection_sock.c:952 net/ipv4/inet_connection_sock.c:966)
> Read of size 8 at addr ffff88801b370400 by task swapper/0/0
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> Call Trace:
> <IRQ>
> dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
> print_report (mm/kasan/report.c:378 mm/kasan/report.c:488)
> kasan_report (mm/kasan/report.c:603)
> inet_csk_reqsk_queue_drop (./include/net/inet_hashtables.h:180 net/ipv4/inet_connection_sock.c:952 net/ipv4/inet_connection_sock.c:966)
> reqsk_timer_handler (net/ipv4/inet_connection_sock.c:979 net/ipv4/inet_connection_sock.c:1092)
> call_timer_fn (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/timer.h:127 kernel/time/timer.c:1701)
> __run_timers.part.0 (kernel/time/timer.c:1752 kernel/time/timer.c:2038)
> run_timer_softirq (kernel/time/timer.c:2053)
> __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:554)
> irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632 kernel/softirq.c:644)
> sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1076 (discriminator 14))
> </IRQ>
>
> Allocated by task 258 on cpu 0 at 83.612050s:
> kasan_save_stack (mm/kasan/common.c:48)
> kasan_save_track (mm/kasan/common.c:68)
> __kasan_slab_alloc (mm/kasan/common.c:343)
> kmem_cache_alloc (mm/slub.c:3813 mm/slub.c:3860 mm/slub.c:3867)
> copy_net_ns (./include/linux/slab.h:701 net/core/net_namespace.c:421 net/core/net_namespace.c:480)
> create_new_namespaces (kernel/nsproxy.c:110)
> unshare_nsproxy_namespaces (kernel/nsproxy.c:228 (discriminator 4))
> ksys_unshare (kernel/fork.c:3429)
> __x64_sys_unshare (kernel/fork.c:3496)
> do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
>
> Freed by task 27 on cpu 0 at 329.158864s:
> kasan_save_stack (mm/kasan/common.c:48)
> kasan_save_track (mm/kasan/common.c:68)
> kasan_save_free_info (mm/kasan/generic.c:643)
> __kasan_slab_free (mm/kasan/common.c:265)
> kmem_cache_free (mm/slub.c:4299 mm/slub.c:4363)
> cleanup_net (net/core/net_namespace.c:456 net/core/net_namespace.c:446 net/core/net_namespace.c:639)
> process_one_work (kernel/workqueue.c:2638)
> worker_thread (kernel/workqueue.c:2700 kernel/workqueue.c:2787)
> kthread (kernel/kthread.c:388)
> ret_from_fork (arch/x86/kernel/process.c:153)
> ret_from_fork_asm (arch/x86/entry/entry_64.S:250)
>
> The buggy address belongs to the object at ffff88801b370000
> which belongs to the cache net_namespace of size 4352
> The buggy address is located 1024 bytes inside of
> freed 4352-byte region [ffff88801b370000, ffff88801b371100)
>
> [2]:
> WARNING: CPU: 0 PID: 95 at lib/ref_tracker.c:228 ref_tracker_free (lib/ref_tracker.c:228 (discriminator 1))
> Modules linked in:
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> RIP: 0010:ref_tracker_free (lib/ref_tracker.c:228 (discriminator 1))
> ...
> Call Trace:
> <IRQ>
> __sk_destruct (./include/net/net_namespace.h:353 net/core/sock.c:2204)
> rcu_core (./arch/x86/include/asm/preempt.h:26 kernel/rcu/tree.c:2165 kernel/rcu/tree.c:2433)
> __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:554)
> irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632 kernel/softirq.c:644)
> sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1076 (discriminator 14))
> </IRQ>
>
> Reported-by: syzkaller <syzkaller@...glegroups.com>
> Suggested-by: Eric Dumazet <edumazet@...gle.com>
> Fixes: 467fa15356ac ("RDS-TCP: Support multiple RDS-TCP listen endpoints, one per netns.")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.com>
Eric, the patches LGTM, and I can see your suggested-by tag, but better
safe than sorry: could you please confirm it is ok for you too?
Thanks,
Paolo