lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACSApvYckruRkaSb-9j_J+DsNpL9csPahHgkAZ3u2_95u4RHng@mail.gmail.com>
Date:   Wed, 1 Mar 2017 13:06:51 -0800
From:   Soheil Hassas Yeganeh <soheil@...gle.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     Eric Dumazet <edumazet@...gle.com>,
        David Miller <davem@...emloft.net>,
        Andrey Konovalov <andreyknvl@...gle.com>,
        netdev <netdev@...r.kernel.org>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Alexei Starovoitov <ast@...nel.org>
Subject: Re: [PATCH net] tcp/dccp: block BH for SYN processing

On Wed, Mar 1, 2017 at 8:39 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> From: Eric Dumazet <edumazet@...gle.com>
>
> SYN processing really was meant to be handled from BH.
>
> When I got rid of BH blocking while processing socket backlog
> in commit 5413d1babe8f ("net: do not block BH while processing socket
> backlog"), I forgot that a malicious user could transition to TCP_LISTEN
> from a state that allowed (SYN) packets to be parked in the socket
> backlog while socket is owned by the thread doing the listen() call.
>
> Sure enough syzkaller found this and reported the bug ;)
>
>
> =================================
> [ INFO: inconsistent lock state ]
> 4.10.0+ #60 Not tainted
> ---------------------------------
> inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
> syz-executor0/5090 [HC0[0]:SC0[0]:HE1:SE1] takes:
>  (&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
> [<ffffffff83a6a370>] spin_lock include/linux/spinlock.h:299 [inline]
>  (&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
> [<ffffffff83a6a370>] inet_ehash_insert+0x240/0xad0
> net/ipv4/inet_hashtables.c:407
> {IN-SOFTIRQ-W} state was registered at:
>   mark_irqflags kernel/locking/lockdep.c:2923 [inline]
>   __lock_acquire+0xbcf/0x3270 kernel/locking/lockdep.c:3295
>   lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
>   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>   _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
>   spin_lock include/linux/spinlock.h:299 [inline]
>   inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
>   reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
>   inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
>   tcp_conn_request+0x25cc/0x3310 net/ipv4/tcp_input.c:6399
>   tcp_v4_conn_request+0x157/0x220 net/ipv4/tcp_ipv4.c:1262
>   tcp_rcv_state_process+0x802/0x4130 net/ipv4/tcp_input.c:5889
>   tcp_v4_do_rcv+0x56b/0x940 net/ipv4/tcp_ipv4.c:1433
>   tcp_v4_rcv+0x2e12/0x3210 net/ipv4/tcp_ipv4.c:1711
>   ip_local_deliver_finish+0x4ce/0xc40 net/ipv4/ip_input.c:216
>   NF_HOOK include/linux/netfilter.h:257 [inline]
>   ip_local_deliver+0x1ce/0x710 net/ipv4/ip_input.c:257
>   dst_input include/net/dst.h:492 [inline]
>   ip_rcv_finish+0xb1d/0x2110 net/ipv4/ip_input.c:396
>   NF_HOOK include/linux/netfilter.h:257 [inline]
>   ip_rcv+0xd90/0x19c0 net/ipv4/ip_input.c:487
>   __netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4179
>   __netif_receive_skb+0x2a/0x170 net/core/dev.c:4217
>   netif_receive_skb_internal+0x1d6/0x430 net/core/dev.c:4245
>   napi_skb_finish net/core/dev.c:4602 [inline]
>   napi_gro_receive+0x4e6/0x680 net/core/dev.c:4636
>   e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4033 [inline]
>   e1000_clean_rx_irq+0x5e0/0x1490
> drivers/net/ethernet/intel/e1000/e1000_main.c:4489
>   e1000_clean+0xb9a/0x2910 drivers/net/ethernet/intel/e1000/e1000_main.c:3834
>   napi_poll net/core/dev.c:5171 [inline]
>   net_rx_action+0xe70/0x1900 net/core/dev.c:5236
>   __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
>   invoke_softirq kernel/softirq.c:364 [inline]
>   irq_exit+0x19e/0x1d0 kernel/softirq.c:405
>   exiting_irq arch/x86/include/asm/apic.h:658 [inline]
>   do_IRQ+0x81/0x1a0 arch/x86/kernel/irq.c:250
>   ret_from_intr+0x0/0x20
>   native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
>   arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
>   default_idle+0x8f/0x410 arch/x86/kernel/process.c:271
>   arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:262
>   default_idle_call+0x36/0x60 kernel/sched/idle.c:96
>   cpuidle_idle_call kernel/sched/idle.c:154 [inline]
>   do_idle+0x348/0x440 kernel/sched/idle.c:243
>   cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:345
>   start_secondary+0x344/0x440 arch/x86/kernel/smpboot.c:272
>   verify_cpu+0x0/0xfc
> irq event stamp: 1741
> hardirqs last  enabled at (1741): [<ffffffff84d49d77>]
> __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160
> [inline]
> hardirqs last  enabled at (1741): [<ffffffff84d49d77>]
> _raw_spin_unlock_irqrestore+0xf7/0x1a0 kernel/locking/spinlock.c:191
> hardirqs last disabled at (1740): [<ffffffff84d4a732>]
> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
> hardirqs last disabled at (1740): [<ffffffff84d4a732>]
> _raw_spin_lock_irqsave+0xa2/0x110 kernel/locking/spinlock.c:159
> softirqs last  enabled at (1738): [<ffffffff84d4deff>]
> __do_softirq+0x7cf/0xb7d kernel/softirq.c:310
> softirqs last disabled at (1571): [<ffffffff84d4b92c>]
> do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(&(&hashinfo->ehash_locks[i])->rlock);
>   <Interrupt>
>     lock(&(&hashinfo->ehash_locks[i])->rlock);
>
>  *** DEADLOCK ***
>
> 1 lock held by syz-executor0/5090:
>  #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>] lock_sock
> include/net/sock.h:1460 [inline]
>  #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>]
> sock_setsockopt+0x233/0x1e40 net/core/sock.c:683
>
> stack backtrace:
> CPU: 1 PID: 5090 Comm: syz-executor0 Not tainted 4.10.0+ #60
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:15 [inline]
>  dump_stack+0x292/0x398 lib/dump_stack.c:51
>  print_usage_bug+0x3ef/0x450 kernel/locking/lockdep.c:2387
>  valid_state kernel/locking/lockdep.c:2400 [inline]
>  mark_lock_irq kernel/locking/lockdep.c:2602 [inline]
>  mark_lock+0xf30/0x1410 kernel/locking/lockdep.c:3065
>  mark_irqflags kernel/locking/lockdep.c:2941 [inline]
>  __lock_acquire+0x6dc/0x3270 kernel/locking/lockdep.c:3295
>  lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
>  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>  _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
>  spin_lock include/linux/spinlock.h:299 [inline]
>  inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
>  reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
>  inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
>  dccp_v6_conn_request+0xada/0x11b0 net/dccp/ipv6.c:380
>  dccp_rcv_state_process+0x51e/0x1660 net/dccp/input.c:606
>  dccp_v6_do_rcv+0x213/0x350 net/dccp/ipv6.c:632
>  sk_backlog_rcv include/net/sock.h:896 [inline]
>  __release_sock+0x127/0x3a0 net/core/sock.c:2052
>  release_sock+0xa5/0x2b0 net/core/sock.c:2539
>  sock_setsockopt+0x60f/0x1e40 net/core/sock.c:1016
>  SYSC_setsockopt net/socket.c:1782 [inline]
>  SyS_setsockopt+0x2fb/0x3a0 net/socket.c:1765
>  entry_SYSCALL_64_fastpath+0x1f/0xc2
> RIP: 0033:0x4458b9
> RSP: 002b:00007fe8b26c2b58 EFLAGS: 00000292 ORIG_RAX: 0000000000000036
> RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00000000004458b9
> RDX: 000000000000001a RSI: 0000000000000001 RDI: 0000000000000006
> RBP: 00000000006e2110 R08: 0000000000000010 R09: 0000000000000000
> R10: 00000000208c3000 R11: 0000000000000292 R12: 0000000000708000
> R13: 0000000020000000 R14: 0000000000001000 R15: 0000000000000000
>
> Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Reported-by: Andrey Konovalov <andreyknvl@...gle.com>

Acked-by: Soheil Hassas Yeganeh <soheil@...gle.com>

> ---
>  net/dccp/input.c     |   10 ++++++++--
>  net/ipv4/tcp_input.c |   10 ++++++++--
>  2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/net/dccp/input.c b/net/dccp/input.c
> index 8fedc2d497709b3dea9202894f45bf5cab043361..4a05d78768502df69275b4f91cb03bb2ada9f4c3 100644
> --- a/net/dccp/input.c
> +++ b/net/dccp/input.c
> @@ -577,6 +577,7 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
>         struct dccp_sock *dp = dccp_sk(sk);
>         struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb);
>         const int old_state = sk->sk_state;
> +       bool acceptable;
>         int queued = 0;
>
>         /*
> @@ -603,8 +604,13 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
>          */
>         if (sk->sk_state == DCCP_LISTEN) {
>                 if (dh->dccph_type == DCCP_PKT_REQUEST) {
> -                       if (inet_csk(sk)->icsk_af_ops->conn_request(sk,
> -                                                                   skb) < 0)
> +                       /* It is possible that we process SYN packets from backlog,
> +                        * so we need to make sure to disable BH right there.
> +                        */
> +                       local_bh_disable();
> +                       acceptable = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb) >= 0;
> +                       local_bh_enable();
> +                       if (!acceptable)
>                                 return 1;
>                         consume_skb(skb);
>                         return 0;
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 2c0ff327b6dfe6919f22bf52687816e19c2c0444..39c393cc0fd3c17130cd5d8d8b37f31ad3aeafd9 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -5886,9 +5886,15 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
>                 if (th->syn) {
>                         if (th->fin)
>                                 goto discard;
> -                       if (icsk->icsk_af_ops->conn_request(sk, skb) < 0)
> -                               return 1;
> +                       /* It is possible that we process SYN packets from backlog,
> +                        * so we need to make sure to disable BH right there.
> +                        */
> +                       local_bh_disable();
> +                       acceptable = icsk->icsk_af_ops->conn_request(sk, skb) >= 0;
> +                       local_bh_enable();
>
> +                       if (!acceptable)
> +                               return 1;
>                         consume_skb(skb);
>                         return 0;
>                 }
>
>

Nice fix! Thanks Eric!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ