Message-ID: <ZGzfzEs-vJcZAySI@zx2c4.com>
Date:   Tue, 23 May 2023 17:46:20 +0200
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     syzbot <syzbot+c2775460db0e1c70018e@...kaller.appspotmail.com>,
        edumazet@...gle.com, kuba@...nel.org, netdev@...r.kernel.org,
        syzkaller-bugs@...glegroups.com
Cc:     davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        pabeni@...hat.com, syzkaller-bugs@...glegroups.com,
        wireguard@...ts.zx2c4.com, jann@...jh.net
Subject: Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in
 enqueue_timer

Hey Syzkaller & Netdev folks,

I've been looking at this a bit and am slightly puzzled. At first I saw
this:

>  enqueue_timer+0xad/0x560 kernel/time/timer.c:605
>  internal_add_timer kernel/time/timer.c:634 [inline]
>  __mod_timer+0xa76/0xf40 kernel/time/timer.c:1131
>  mod_peer_timer+0x158/0x220 drivers/net/wireguard/timers.c:37
>  wg_packet_consume_data_done drivers/net/wireguard/receive.c:354 [inline]
>  wg_packet_rx_poll+0xd9e/0x2250 drivers/net/wireguard/receive.c:474

And I thought - darn, it's a bug where a struct wg_peer's timer is
modified -- in this case, timer_persistent_keepalive by way of
wg_timers_any_authenticated_packet_traversal() -- after the peer object
has been freed. This most clearly fits the designated line
receive.c:354, and the subsequent 8-byte write when enqueuing the timer.

So I traced through the peer shutdown code in peer.c -- the
peer_make_dead() + peer_remove_after_dead() combo -- and made sure the
peer->is_dead RCU logic was correct. And I couldn't find a bug.
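
For readers following along, here is a rough userspace model of the ordering being checked above. It is only a sketch: every fake_* name is hypothetical, the structs are stand-ins rather than the real WireGuard ones, and it is single-threaded, so it deliberately leaves out the RCU grace period that the real code depends on between the two teardown steps.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the real kernel or WireGuard structures. */
struct fake_timer {
	bool pending;
	unsigned long expires;
};

struct fake_peer {
	bool is_dead;
	struct fake_timer keepalive;
};

/* Arm the timer only while the peer is alive. Returns true if it was armed. */
static bool fake_mod_peer_timer(struct fake_peer *peer,
				struct fake_timer *timer,
				unsigned long expires)
{
	if (peer->is_dead)
		return false;
	timer->pending = true;
	timer->expires = expires;
	return true;
}

/* Teardown step 1: mark the peer dead so that no new timer gets armed. */
static void fake_peer_make_dead(struct fake_peer *peer)
{
	peer->is_dead = true;
}

/*
 * Teardown step 2: cancel whatever is still pending. Only after this
 * step would it be safe to free the peer. (The real code would first
 * wait for concurrent readers to finish; this model has none.)
 */
static void fake_peer_remove_after_dead(struct fake_peer *peer)
{
	assert(peer->is_dead);
	peer->keepalive.pending = false;
}
```

If that ordering holds, a call to fake_mod_peer_timer() after step 1 is refused, so nothing is left pending once step 2 completes.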

But then I looked further down at the syzbot report:

> Allocated by task 16792:
>  kvzalloc include/linux/slab.h:705 [inline]
>  alloc_netdev_mqs+0x89/0xf30 net/core/dev.c:10626
>  rtnl_create_link+0x2f7/0xc00 net/core/rtnetlink.c:3315

and

> Freed by task 41:
>  __kmem_cache_free+0x264/0x3c0 mm/slub.c:3799
>  device_release+0x95/0x1c0
>  kobject_cleanup lib/kobject.c:683 [inline]
>  kobject_release lib/kobject.c:714 [inline]
>  kref_put include/linux/kref.h:65 [inline]
>  kobject_put+0x228/0x470 lib/kobject.c:731
>  netdev_run_todo+0xe5a/0xf50 net/core/dev.c:10400

So that means the memory in question is actually the one that's
allocated and freed by the networking stack. Specifically, dev.c:10626
is allocating a struct net_device with a trailing struct wg_device (its
priv_data). However, wg_device does not contain any struct timer_list,
and I don't see how net_device's watchdog_timer would be related to the
stacktrace, which is clearly operating on a wg_peer timer.
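
To make the layout concrete, here is a small userspace sketch of a device struct followed by its private area in a single allocation. The fake_* names and PRIV_ALIGN are hypothetical stand-ins (PRIV_ALIGN plays the role I assume NETDEV_ALIGN plays in the kernel), and the real alloc_netdev_mqs() does considerably more than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PRIV_ALIGN 32 /* assumed stand-in for the kernel's NETDEV_ALIGN */

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct fake_net_device {
	char name[16];
	size_t priv_len;
};

struct fake_wg_device {
	struct fake_net_device *dev;
	unsigned int num_peers;
};

/* Size of the device struct rounded up to the private-area alignment. */
static size_t aligned_dev_size(void)
{
	return (sizeof(struct fake_net_device) + PRIV_ALIGN - 1) &
	       ~(size_t)(PRIV_ALIGN - 1);
}

/* One zeroed allocation holds the device followed by its private area. */
static struct fake_net_device *fake_alloc_netdev(size_t sizeof_priv)
{
	struct fake_net_device *dev = calloc(1, aligned_dev_size() + sizeof_priv);

	if (dev)
		dev->priv_len = sizeof_priv;
	return dev;
}

/* The private area starts right after the aligned device struct. */
static void *fake_netdev_priv(struct fake_net_device *dev)
{
	assert(dev != NULL);
	return (char *)dev + aligned_dev_size();
}

/* Does addr fall inside the single allocation backing dev? */
static int addr_in_netdev_alloc(const struct fake_net_device *dev,
				const void *addr)
{
	uintptr_t start = (uintptr_t)dev;
	uintptr_t end = start + aligned_dev_size() + dev->priv_len;
	uintptr_t p = (uintptr_t)addr;

	return p >= start && p < end;
}
```

In this model, anything embedded in fake_wg_device lives inside the device allocation, while a separately allocated object (as I'd expect a peer to be) does not, which is why a report blaming the netdev allocation for a peer timer write looks odd.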

So what on earth is going on here?

Jason

PS - Jakub, I have some WG fixes queued up for you, but I wanted to have
some resolution with this first before sending a tranche.
