Message-ID: <CANn89i+WshtNwNSALCpbQbZFWN41xP85+c8GdHX2DabzQzx+6A@mail.gmail.com>
Date: Mon, 18 Sep 2023 10:05:09 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: George Guo <guodongtai@...inos.cn>, Florian Westphal <fw@...len.de>
Cc: davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
dsahern@...nel.org, netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1] tcp: enhancing timestamps random algo to address
issues arising from NAT mapping
On Mon, Sep 18, 2023 at 3:46 AM George Guo <guodongtai@...inos.cn> wrote:
>
> Tsval = tsoffset + local_clock, where tsoffset is randomized from the saddr and daddr parameters
> in secure_tcp_ts_off(). Most of the time this is fine, except when NAT maps connections to the
> same port and daddr.
> Consider the following scenario:
>   ns1:                 ns2:
>   +-----------+        +-----------+
>   |           |        |           |
>   |           |        |           |
>   |           |        |           |
>   |   veth1   |        |   vethb   |
>   |192.168.1.1|        |192.168.1.2|
>   +----+------+        +-----+-----+
>        |                     |
>        |                     |
>        |  br0:192.168.1.254  |
>        +----------+----------+
>       veth0       |       vetha
>    192.168.1.3    |    192.168.1.4
>                   |
>   nat(192.168.1.x -->172.30.60.199)
>                   |
>                   V
>                 eth0
>             172.30.60.199
>                   |
>                   |
>                   +----> ... ... ---->server: 172.30.60.191
>
> Let's say ns1 (192.168.1.1) generates a timestamp ts1, and ns2 (192.168.1.2) generates a timestamp
> ts2, with ts1 > ts2.
>
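A minimal sketch of the Tsval calculation described above, not the kernel code. It assumes a keyed BLAKE2 hash as a stand-in for the hash used by secure_tcp_ts_off(), and SECRET, ts_offset() and tsval() are hypothetical names chosen for illustration. It shows that two sources behind the NAT, reading the same local clock, will almost certainly produce different Tsval values towards the same server:

```python
import hashlib
import ipaddress

# Hypothetical secret; the kernel uses a random key generated at runtime.
SECRET = b"example-boot-secret"


def ts_offset(saddr: str, daddr: str, secret: bytes = SECRET) -> int:
    """Return a 32-bit offset derived from (saddr, daddr).

    Keyed BLAKE2 is only a stand-in for the kernel's keyed hash.
    """
    data = ipaddress.IPv4Address(saddr).packed + ipaddress.IPv4Address(daddr).packed
    digest = hashlib.blake2b(data, key=secret, digest_size=4).digest()
    return int.from_bytes(digest, "big")


def tsval(saddr: str, daddr: str, local_clock: int) -> int:
    """Tsval = tsoffset + local_clock, wrapped to 32 bits."""
    return (ts_offset(saddr, daddr) + local_clock) & 0xFFFFFFFF


clock = 1000  # same clock reading for both namespaces on one host
ts1 = tsval("192.168.1.1", "172.30.60.191", clock)  # ns1
ts2 = tsval("192.168.1.2", "172.30.60.191", clock)  # ns2
print(ts1, ts2)  # unrelated values; either one may be the larger
```

Because the offsets are unrelated, the server has no guarantee that the second connection's timestamps are newer than the first's once NAT hides the original source addresses.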
> Suppose ns1 opens a connection to the server and the server then actively closes it, entering
> the TIME_WAIT state. If ns2 now connects to the server and the same port is reused, then because
> of the NAT the server sees both connections as originating from the same IP address
> (e.g., 172.30.60.199) and port. Since ts2 is smaller than ts1, the server answers the new SYN
> with the final ACK of the four-way close instead of a SYN-ACK.
>
> SERVER CLIENT
>
> 1. ESTABLISHED ESTABLISHED
>
> (Close)
> 2. FIN-WAIT-1 --> <SEQ=100><ACK=300><TSval=20><CTL=FIN,ACK> --> CLOSE-WAIT
>
> 3. FIN-WAIT-2 <-- <SEQ=300><ACK=101><TSval=40><CTL=ACK> <-- CLOSE-WAIT
>
> (Close)
> 4. TIME-WAIT <-- <SEQ=300><ACK=101><TSval=41><CTL=FIN,ACK> <-- LAST-ACK
>
> 5. TIME-WAIT --> <SEQ=101><ACK=301><TSval=25><CTL=ACK> --> CLOSED
>
> - - - - - - - - - - - - - port reused - - - - - - - - - - - - - - -
>
> 5.1. TIME-WAIT <-- <SEQ=255><TSval=30><CTL=SYN> <-- SYN-SENT
>
> 5.2. TIME-WAIT --> <SEQ=101><ACK=301><TSval=35><CTL=ACK> --> SYN-SENT
>
> 5.3. CLOSED <-- <SEQ=301><CTL=RST> <-- SYN-SENT
>
> 6. SYN-RECV <-- <SEQ=255><TSval=34><CTL=SYN> <-- SYN-SENT
>
> 7. SYN-RECV --> <SEQ=400><ACK=301><TSval=40><CTL=SYN,ACK> --> ESTABLISHED
>
> 1. ESTABLISHED <-- <SEQ=301><ACK=401><TSval=55><CTL=ACK> <-- ESTABLISHED
>
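The server-side behaviour in the trace above can be sketched roughly as follows. This is a simplified model and an assumption on my part, loosely based on how a TIME-WAIT socket compares an incoming SYN against the last timestamp and sequence number it saw. The function and parameter names are hypothetical, and details such as timestamp aging are ignored:

```python
def s32(x: int) -> int:
    """Interpret x as a signed 32-bit value (wraparound-safe comparison)."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def handle_syn_in_time_wait(syn_seq: int, syn_tsval: int,
                            rcv_nxt: int, ts_recent: int) -> str:
    """Decide how a TIME-WAIT socket answers a SYN for the same 4-tuple.

    rcv_nxt   -- next sequence number expected from the old peer
    ts_recent -- last TSval seen from the old peer
    """
    ts_delta = s32(syn_tsval - ts_recent)
    if ts_delta < 0:
        # Timestamp went backwards: looks like an old duplicate, re-send the ACK.
        return "ACK"
    if ts_delta > 0 or s32(syn_seq - rcv_nxt) > 0:
        # Newer timestamp or sequence number: accept it as a new connection.
        return "SYN-ACK"
    return "ACK"


# Values from step 5.1 of the trace: old peer last sent TSval=41, FIN at SEQ=300.
print(handle_syn_in_time_wait(syn_seq=255, syn_tsval=30, rcv_nxt=301, ts_recent=41))
```

With TSval=30 against a remembered 41, the sketch returns "ACK", which matches step 5.2; the client then resets and has to retransmit its SYN.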
> This enhancement uses sport and daddr rather than saddr and daddr, which keeps the timestamp
> monotonically increasing in the situation described above. Port reuse then looks like this:
>
> SERVER CLIENT
>
> 1. ESTABLISHED ESTABLISHED
>
> (Close)
> 2. FIN-WAIT-1 --> <SEQ=100><ACK=300><TSval=20><CTL=FIN,ACK> --> CLOSE-WAIT
>
> 3. FIN-WAIT-2 <-- <SEQ=300><ACK=101><TSval=40><CTL=ACK> <-- CLOSE-WAIT
>
> (Close)
> 4. TIME-WAIT <-- <SEQ=300><ACK=101><TSval=41><CTL=FIN,ACK> <-- LAST-ACK
>
> 5. TIME-WAIT --> <SEQ=101><ACK=301><TSval=25><CTL=ACK> --> CLOSED
>
> - - - - - - - - - - - - - port reused - - - - - - - - - - - - - - -
>
> 5.1. TIME-WAIT <-- <SEQ=300><TSval=50><CTL=SYN> <-- SYN-SENT
>
> 6. SYN-RECV --> <SEQ=400><ACK=301><TSval=40><CTL=SYN,ACK> --> ESTABLISHED
>
> 1. ESTABLISHED <-- <SEQ=301><ACK=401><TSval=55><CTL=ACK> <-- ESTABLISHED
>
> The enhancement lets ports be reused more efficiently.
>
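For illustration only, the proposed keying can be sketched like this. It is not the patch itself: keyed BLAKE2 stands in for the kernel's hash, and SECRET and ts_offset_by_port() are hypothetical names. Two namespaces on the same host that use the same source port towards the same daddr get the same offset, whatever their saddr:

```python
import hashlib
import ipaddress

# Hypothetical secret; the kernel uses a random key generated at runtime.
SECRET = b"example-boot-secret"


def ts_offset_by_port(sport: int, daddr: str, secret: bytes = SECRET) -> int:
    """Return a 32-bit offset derived from (sport, daddr) instead of (saddr, daddr)."""
    data = sport.to_bytes(2, "big") + ipaddress.IPv4Address(daddr).packed
    digest = hashlib.blake2b(data, key=secret, digest_size=4).digest()
    return int.from_bytes(digest, "big")


# saddr no longer enters the hash, so ns1 and ns2 agree when sport matches.
off_ns1 = ts_offset_by_port(40000, "172.30.60.191")  # from 192.168.1.1
off_ns2 = ts_offset_by_port(40000, "172.30.60.191")  # from 192.168.1.2
print(off_ns1 == off_ns2)  # True

# A different source port generally yields a different offset.
off_other_port = ts_offset_by_port(40001, "172.30.60.191")
```

The sketch assumes both namespaces share the same secret and clock, which holds only when they live on one host; separate machines behind the same NAT would not.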
> Signed-off-by: George Guo <guodongtai@...inos.cn>
>
CC Florian
I do not think we can 'fix' TCP timestamps vs NAT,
unless the NAT device makes sure a port is dedicated to a peer,
and/or the NAT rewrites TS values
(which would be bad).
I personally prefer seeing the same timestamps from A to B regardless
of ports, it helps detect various issues.
Also, you seem to forget IPv6.