[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <14224FE4-2425-464B-824B-EF74AAAE89BC@gmail.com>
Date: Thu, 20 Dec 2018 16:42:31 -0800
From: "Jonathan Lemon" <jonathan.lemon@...il.com>
To: "Eric Dumazet" <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>,
"Eric Dumazet" <eric.dumazet@...il.com>,
"Alexei Starovoitov" <ast@...nel.org>
Subject: Re: [PATCH net] tcp: fix a race in inet_diag_dump_icsk()
Acked-by: Jonathan Lemon <jonathan.lemon@...il.com>
On 20 Dec 2018, at 15:28, Eric Dumazet wrote:
> Alexei reported use after frees in inet_diag_dump_icsk() [1]
>
> Because we use refcount_set() when various sockets are setup and
> inserted into ehash, we also need to make sure inet_diag_dump_icsk()
> wont race with the refcount_set() operations.
>
> Jonathan Lemon sent a patch changing net_twsk_hashdance() but
> other spots would need risky changes.
>
> Instead, fix inet_diag_dump_icsk() as this bug came with
> linux-4.10 only.
>
> [1] Quoting Alexei :
>
> First something iterating over sockets finds already freed tw socket:
>
> refcount_t: increment on 0; use-after-free.
> WARNING: CPU: 2 PID: 2738 at lib/refcount.c:153 refcount_inc+0x26/0x30
> RIP: 0010:refcount_inc+0x26/0x30
> RSP: 0018:ffffc90004c8fbc0 EFLAGS: 00010282
> RAX: 000000000000002b RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff88085ee9d680 RSI: ffff88085ee954c8 RDI: ffff88085ee954c8
> RBP: ffff88010ecbd2c0 R08: 0000000000000000 R09: 000000000000174c
> R10: ffffffff81e7c5a0 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff8806ba9bf210 R14: ffffffff82304600 R15: ffff88010ecbd328
> FS: 00007f81f5a7d700(0000) GS:ffff88085ee80000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f81e2a95000 CR3: 000000069b2eb006 CR4: 00000000003606e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> inet_diag_dump_icsk+0x2b3/0x4e0 [inet_diag] // sock_hold(sk); in
> net/ipv4/inet_diag.c:1002
> ? kmalloc_large_node+0x37/0x70
> ? __kmalloc_node_track_caller+0x1cb/0x260
> ? __alloc_skb+0x72/0x1b0
> ? __kmalloc_reserve.isra.40+0x2e/0x80
> __inet_diag_dump+0x3b/0x80 [inet_diag]
> netlink_dump+0x116/0x2a0
> netlink_recvmsg+0x205/0x3c0
> sock_read_iter+0x89/0xd0
> __vfs_read+0xf7/0x140
> vfs_read+0x8a/0x140
> SyS_read+0x3f/0xa0
> do_syscall_64+0x5a/0x100
>
> then a minute later twsk timer fires and hits two bad refcnts
> for this freed socket:
>
> refcount_t: decrement hit 0; leaking memory.
> WARNING: CPU: 31 PID: 0 at lib/refcount.c:228 refcount_dec+0x2e/0x40
> Modules linked in:
> RIP: 0010:refcount_dec+0x2e/0x40
> RSP: 0018:ffff88085f5c3ea8 EFLAGS: 00010296
> RAX: 000000000000002c RBX: ffff88010ecbd2c0 RCX: 000000000000083f
> RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000003f
> RBP: ffffc90003c77280 R08: 0000000000000000 R09: 00000000000017d3
> R10: ffffffff81e7c5a0 R11: 0000000000000000 R12: ffffffff82ad2d80
> R13: ffffffff8182de00 R14: ffff88085f5c3ef8 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff88085f5c0000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fbe42685250 CR3: 0000000002209001 CR4: 00000000003606e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <IRQ>
> inet_twsk_kill+0x9d/0xc0 // inet_twsk_bind_unhash(tw, hashinfo);
> call_timer_fn+0x29/0x110
> run_timer_softirq+0x36b/0x3a0
>
> refcount_t: underflow; use-after-free.
> WARNING: CPU: 31 PID: 0 at lib/refcount.c:187
> refcount_sub_and_test+0x46/0x50
> RIP: 0010:refcount_sub_and_test+0x46/0x50
> RSP: 0018:ffff88085f5c3eb8 EFLAGS: 00010296
> RAX: 0000000000000026 RBX: ffff88010ecbd2c0 RCX: 000000000000083f
> RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000003f
> RBP: ffff88010ecbd358 R08: 0000000000000000 R09: 000000000000185b
> R10: ffffffff81e7c5a0 R11: 0000000000000000 R12: ffff88010ecbd358
> R13: ffffffff8182de00 R14: ffff88085f5c3ef8 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff88085f5c0000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fbe42685250 CR3: 0000000002209001 CR4: 00000000003606e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <IRQ>
> inet_twsk_put+0x12/0x20 // inet_twsk_put(tw);
> call_timer_fn+0x29/0x110
> run_timer_softirq+0x36b/0x3a0
>
> Fixes: 67db3e4bfbc9 ("tcp: no longer hold ehash lock while calling
> tcp_get_info()")
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Reported-by: Alexei Starovoitov <ast@...nel.org>
> Cc: Jonathan Lemon <jonathan.lemon@...il.com>
> ---
> net/ipv4/inet_diag.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
> index
> 4e5bc4b2f14e6786ceb7d63e5902f8fc17819dfa..1a4e9ff02762ed757545da13de1ee352f38c867b
> 100644
> --- a/net/ipv4/inet_diag.c
> +++ b/net/ipv4/inet_diag.c
> @@ -998,7 +998,9 @@ void inet_diag_dump_icsk(struct inet_hashinfo
> *hashinfo, struct sk_buff *skb,
> if (!inet_diag_bc_sk(bc, sk))
> goto next_normal;
>
> - sock_hold(sk);
> + if (!refcount_inc_not_zero(&sk->sk_refcnt))
> + goto next_normal;
> +
> num_arr[accum] = num;
> sk_arr[accum] = sk;
> if (++accum == SKARR_SZ)
> --
> 2.20.1.415.g653613c723-goog
Powered by blists - more mailing lists