lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 19 Feb 2019 17:00:41 +0100
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Jakub Sitnicki <jakub@...udflare.com>
Cc:     netdev@...r.kernel.org, John Fastabend <john.fastabend@...il.com>,
        Marek Majkowski <marek@...udflare.com>
Subject: Re: [PATCH net] sk_msg: Keep reference on socket file while psock
 lives

Hi Jakub,

On 02/11/2019 10:09 AM, Jakub Sitnicki wrote:
> Backlog work for psock (sk_psock_backlog) might sleep while waiting for
> memory to free up when sending packets. While sleeping, socket can
> disappear from under our feet together with its wait queue because the
> userspace has closed it.
> 
> This breaks an assumption in sk_stream_wait_memory, which expects the
> wait queue to be still there when it wakes up resulting in a
> use-after-free:
> 
> ==================================================================
> BUG: KASAN: use-after-free in remove_wait_queue+0x31/0x70
> Write of size 8 at addr ffff888069a0c4e8 by task kworker/0:2/110
> 
> CPU: 0 PID: 110 Comm: kworker/0:2 Not tainted 5.0.0-rc2-00335-g28f9d1a3d4fe-dirty #14
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
> Workqueue: events sk_psock_backlog
> Call Trace:
>  print_address_description+0x6e/0x2b0
>  ? remove_wait_queue+0x31/0x70
>  kasan_report+0xfd/0x177
>  ? remove_wait_queue+0x31/0x70
>  ? remove_wait_queue+0x31/0x70
>  remove_wait_queue+0x31/0x70
>  sk_stream_wait_memory+0x4dd/0x5f0
>  ? sk_stream_wait_close+0x1b0/0x1b0
>  ? wait_woken+0xc0/0xc0
>  ? tcp_current_mss+0xc5/0x110
>  tcp_sendmsg_locked+0x634/0x15d0
>  ? tcp_set_state+0x2e0/0x2e0
>  ? __kasan_slab_free+0x1d1/0x230
>  ? kmem_cache_free+0x70/0x140
>  ? sk_psock_backlog+0x40c/0x4b0
>  ? process_one_work+0x40b/0x660
>  ? worker_thread+0x82/0x680
>  ? kthread+0x1b9/0x1e0
>  ? ret_from_fork+0x1f/0x30
>  ? check_preempt_curr+0xaf/0x130
>  ? iov_iter_kvec+0x5f/0x70
>  ? kernel_sendmsg_locked+0xa0/0xe0
>  skb_send_sock_locked+0x273/0x3c0
>  ? skb_splice_bits+0x180/0x180
>  ? start_thread+0xe0/0xe0
>  ? update_min_vruntime.constprop.27+0x88/0xc0
>  sk_psock_backlog+0xb3/0x4b0
>  ? strscpy+0xbf/0x1e0
>  process_one_work+0x40b/0x660
>  worker_thread+0x82/0x680
>  ? process_one_work+0x660/0x660
>  kthread+0x1b9/0x1e0
>  ? __kthread_create_on_node+0x250/0x250
>  ret_from_fork+0x1f/0x30
> 
> Allocated by task 109:
>  sock_alloc_inode+0x54/0x120
>  alloc_inode+0x28/0xb0
>  new_inode_pseudo+0x7/0x60
>  sock_alloc+0x21/0xe0
>  __sys_accept4+0xc2/0x330
>  __x64_sys_accept+0x3b/0x50
>  do_syscall_64+0xb2/0x3e0
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Freed by task 7:
>  kfree+0x7f/0x140
>  rcu_process_callbacks+0xe0/0x100
>  __do_softirq+0xe5/0x29a
> 
> The buggy address belongs to the object at ffff888069a0c4e0
>  which belongs to the cache kmalloc-64 of size 64
> The buggy address is located 8 bytes inside of
>  64-byte region [ffff888069a0c4e0, ffff888069a0c520)
> The buggy address belongs to the page:
> page:ffffea0001a68300 count:1 mapcount:0 mapping:ffff88806d4018c0 index:0x0
> flags: 0x4000000000000200(slab)
> raw: 4000000000000200 dead000000000100 dead000000000200 ffff88806d4018c0
> raw: 0000000000000000 00000000002a002a 00000001ffffffff 0000000000000000
> page dumped because: kasan: bad access detected
> 
> Memory state around the buggy address:
>  ffff888069a0c380: fb fb fb fb fc fc fc fc fb fb fb fb fb fb fb fb
>  ffff888069a0c400: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
>> ffff888069a0c480: 00 00 00 00 00 00 00 00 fc fc fc fc fb fb fb fb
>                                                           ^
>  ffff888069a0c500: fb fb fb fb fc fc fc fc fb fb fb fb fb fb fb fb
>  ffff888069a0c580: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
> ==================================================================
> 
> Avoid it by keeping a reference to the socket file until the psock gets
> destroyed.
> 
> While at it, rearrange the order of reference grabbing and
> initialization to match the destructor in reverse.
> 
> Reported-by: Marek Majkowski <marek@...udflare.com>
> Link: https://lore.kernel.org/netdev/CAJPywTLwgXNEZ2dZVoa=udiZmtrWJ0q5SuBW64aYs0Y1khXX3A@mail.gmail.com
> Signed-off-by: Jakub Sitnicki <jakub@...udflare.com>
> ---
>  net/core/skmsg.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index 8c826603bf36..a38442b8580b 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -493,8 +493,13 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node)
>  	sk_psock_set_state(psock, SK_PSOCK_TX_ENABLED);
>  	refcount_set(&psock->refcnt, 1);
>  
> -	rcu_assign_sk_user_data(sk, psock);
> +	/* Hold on to socket wait queue. Backlog work waits on it for
> +	 * memory when sending. We must cancel work before socket wait
> +	 * queue can go away.
> +	 */
> +	get_file(sk->sk_socket->file);
>  	sock_hold(sk);
> +	rcu_assign_sk_user_data(sk, psock);
>  
>  	return psock;
>  }
> @@ -558,6 +563,7 @@ static void sk_psock_destroy_deferred(struct work_struct *gc)
>  	if (psock->sk_redir)
>  		sock_put(psock->sk_redir);
>  	sock_put(psock->sk);
> +	fput(psock->sk->sk_socket->file);

Thanks for the report (and sorry for the late reply). I think holding ref on
the struct file just so we keep it alive till deferred destruction might be
papering over the actual underlying bug. Nothing obvious pops out from staring
at the code right now; as a reproducer to run, did you use the prog in the link
above and hit it after your strparser fix?

Thanks again,
Daniel

Powered by blists - more mailing lists