Message-ID: <305d00e2-c3db-3621-33a6-c1377b1aca63@gmail.com>
Date: Tue, 17 Nov 2020 10:55:52 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Ricardo Dias <rdias@...glestore.com>, davem@...emloft.net,
kuba@...nel.org, kuznet@....inr.ac.ru, yoshfuji@...ux-ipv6.org,
edumazet@...gle.com
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6] tcp: fix race condition when creating child sockets
from syncookies
On 11/17/20 8:29 AM, Ricardo Dias wrote:
> When the TCP stack is in SYN flood mode, the server child socket is
> created from the SYN cookie received in a TCP packet with the ACK flag
> set.
>
...
>
> @@ -1374,6 +1381,13 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
> skb_set_owner_r(newnp->pktoptions, newsk);
> }
> }
> + } else {
> + if (!req_unhash && esk) {
> + /* This code path should only be executed in the
> + * syncookie case only
> + */
> + newsk = esk;
What happens to the socket we just created and stored in @newsk?
We are going to leak kernel memory quite fast if we simply forget about it.
> + }
> }
>
> return newsk;
>
Anyway, I have tested your patch (see below) and got many errors in the kernel log:
[ 669.578597] cleanup rbuf bug: copied 1A5AD52D seq 1A5AD52D rcvnxt 1A5AD52D
[ 669.578604] WARNING: CPU: 30 PID: 13267 at net/ipv4/tcp.c:1553 tcp_cleanup_rbuf+0x48/0x100
[ 669.578604] Modules linked in: vfat fat w1_therm wire i2c_mux_pca954x i2c_mux cdc_acm ehci_pci ehci_hcd mlx4_en mlx4_ib ib_uverbs ib_core mlx4_core
[ 669.578610] CPU: 30 PID: 13267 Comm: ld-2.15.so.orig Tainted: G S W 5.10.0-smp-DEV #394
[ 669.578611] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.60.0 10/02/2019
[ 669.578612] RIP: 0010:tcp_cleanup_rbuf+0x48/0x100
[ 669.578612] Code: f3 48 39 d0 74 26 48 85 c0 74 21 8b 50 2c 8b b7 7c 05 00 00 39 d6 78 14 8b 8f 78 05 00 00 48 c7 c7 08 8a 57 a5 e8 1d b6 10 00 <0f> 0b 41 0f b6 84 24 c8 04 00 00 a8 01 74 2a 41 0f b7 8c 24 de 04
[ 669.578613] RSP: 0018:ffff893bd898fca8 EFLAGS: 00010286
[ 669.578614] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000ffff7fff
[ 669.578615] RDX: 00000000ffff7fff RSI: 00000000ffffffea RDI: 0000000000000000
[ 669.578615] RBP: ffff893bd898fcb8 R08: ffffffffa5bed108 R09: 0000000000027ffb
[ 669.578616] R10: 00000000ffff8000 R11: 3fffffffffffffff R12: ffff891cd3530140
[ 669.578617] R13: ffff891cd3530140 R14: 0000000000000000 R15: 0000000000000000
[ 669.578618] FS: 00007fb7908ea700(0000) GS:ffff893affc80000(0000) knlGS:0000000000000000
[ 669.578619] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 669.578620] CR2: 000034fedd58981c CR3: 00000020d87b2001 CR4: 00000000001706e0
[ 669.578620] Call Trace:
[ 669.578621] tcp_recvmsg+0x21f/0xac0
[ 669.578624] inet6_recvmsg+0x5f/0x120
[ 669.578625] ? add_wait_queue_exclusive+0x80/0x80
[ 669.578627] sock_recvmsg+0x54/0x80
[ 669.578628] __sys_recvfrom+0x19e/0x1d0
[ 669.578630] ? __fput+0x11c/0x240
[ 669.578632] __x64_sys_recvfrom+0x29/0x30
[ 669.578634] do_syscall_64+0x38/0x50
[ 669.578635] entry_SYSCALL_64_after_hwframe+0x44/0xa9
I have used the following method on the DUT host, plus patch [1] in get_rps_cpu().
The DUT has 48 hyper-threads; I force RPS to spread packets over all of them.
for q in {0..15}; do echo ffff,ffffffff >/sys/class/net/eth0/queues/rx-$q/rps_cpus; done
echo 2 >/proc/sys/net/ipv4/tcp_syncookies
netserver
Then from another host:
for i in {1..20}
do
super_netperf 200 -H DUT -t TCP_CRR -l2 -- -p0
done
After the test, more than 100,000 TCPv6 sockets had been leaked.
[1]
diff --git a/net/core/dev.c b/net/core/dev.c
index 60d325bda0d7b4a1ecb7bf7b3352d58bed8b96a2..ae99206bd05126eef8cf4490bb197a3c781012ed 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4344,7 +4344,7 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb,
goto done;
skb_reset_network_header(skb);
- hash = skb_get_hash(skb);
+ hash = prandom_u32();
if (!hash)
goto done;