Message-ID: <305d00e2-c3db-3621-33a6-c1377b1aca63@gmail.com>
Date: Tue, 17 Nov 2020 10:55:52 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Ricardo Dias <rdias@...glestore.com>, davem@...emloft.net,
kuba@...nel.org, kuznet@....inr.ac.ru, yoshfuji@...ux-ipv6.org,
edumazet@...gle.com
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6] tcp: fix race condition when creating child sockets
from syncookies
On 11/17/20 8:29 AM, Ricardo Dias wrote:
> When the TCP stack is in SYN flood mode, the server child socket is
> created from the SYN cookie received in a TCP packet with the ACK flag
> set.
>
...
>
> @@ -1374,6 +1381,13 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
> skb_set_owner_r(newnp->pktoptions, newsk);
> }
> }
> + } else {
> + if (!req_unhash && esk) {
> + /* This code path should only be executed in the
> + * syncookie case only
> + */
> + newsk = esk;
What happens to the socket we just created and stored in @newsk?
We are going to leak kernel memory quite fast if we simply forget about it.
> + }
> }
>
> return newsk;
>
Anyway, I have tested your patch (see below) and got many errors in the kernel log:
[ 669.578597] cleanup rbuf bug: copied 1A5AD52D seq 1A5AD52D rcvnxt 1A5AD52D
[ 669.578604] WARNING: CPU: 30 PID: 13267 at net/ipv4/tcp.c:1553 tcp_cleanup_rbuf+0x48/0x100
[ 669.578604] Modules linked in: vfat fat w1_therm wire i2c_mux_pca954x i2c_mux cdc_acm ehci_pci ehci_hcd mlx4_en mlx4_ib ib_uverbs ib_core mlx4_core
[ 669.578610] CPU: 30 PID: 13267 Comm: ld-2.15.so.orig Tainted: G S W 5.10.0-smp-DEV #394
[ 669.578611] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.60.0 10/02/2019
[ 669.578612] RIP: 0010:tcp_cleanup_rbuf+0x48/0x100
[ 669.578612] Code: f3 48 39 d0 74 26 48 85 c0 74 21 8b 50 2c 8b b7 7c 05 00 00 39 d6 78 14 8b 8f 78 05 00 00 48 c7 c7 08 8a 57 a5 e8 1d b6 10 00 <0f> 0b 41 0f b6 84 24 c8 04 00 00 a8 01 74 2a 41 0f b7 8c 24 de 04
[ 669.578613] RSP: 0018:ffff893bd898fca8 EFLAGS: 00010286
[ 669.578614] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000ffff7fff
[ 669.578615] RDX: 00000000ffff7fff RSI: 00000000ffffffea RDI: 0000000000000000
[ 669.578615] RBP: ffff893bd898fcb8 R08: ffffffffa5bed108 R09: 0000000000027ffb
[ 669.578616] R10: 00000000ffff8000 R11: 3fffffffffffffff R12: ffff891cd3530140
[ 669.578617] R13: ffff891cd3530140 R14: 0000000000000000 R15: 0000000000000000
[ 669.578618] FS: 00007fb7908ea700(0000) GS:ffff893affc80000(0000) knlGS:0000000000000000
[ 669.578619] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 669.578620] CR2: 000034fedd58981c CR3: 00000020d87b2001 CR4: 00000000001706e0
[ 669.578620] Call Trace:
[ 669.578621] tcp_recvmsg+0x21f/0xac0
[ 669.578624] inet6_recvmsg+0x5f/0x120
[ 669.578625] ? add_wait_queue_exclusive+0x80/0x80
[ 669.578627] sock_recvmsg+0x54/0x80
[ 669.578628] __sys_recvfrom+0x19e/0x1d0
[ 669.578630] ? __fput+0x11c/0x240
[ 669.578632] __x64_sys_recvfrom+0x29/0x30
[ 669.578634] do_syscall_64+0x38/0x50
[ 669.578635] entry_SYSCALL_64_after_hwframe+0x44/0xa9
I have used the following method on the DUT host, plus patch [1] in get_rps_cpu().
The DUT has 48 hyper-threads; I force RPS to spread packets over all of them.
for q in {0..15}; do echo ffff,ffffffff >/sys/class/net/eth0/queues/rx-$q/rps_cpus; done
echo 2 >/proc/sys/net/ipv4/tcp_syncookies
netserver
Then from another host:
for i in {1..20}
do
super_netperf 200 -H DUT -t TCP_CRR -l2 -- -p0
done
After the test, more than 100,000 TCPv6 sockets had been leaked.
[1]
diff --git a/net/core/dev.c b/net/core/dev.c
index 60d325bda0d7b4a1ecb7bf7b3352d58bed8b96a2..ae99206bd05126eef8cf4490bb197a3c781012ed 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4344,7 +4344,7 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb,
goto done;
skb_reset_network_header(skb);
- hash = skb_get_hash(skb);
+ hash = prandom_u32();
if (!hash)
goto done;