netdev - Re: [PATCH] tcp: Add an extra check for consecutive failed keepalive probes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CADVnQy=Uy+UxYivkUY1JZ4+c2rDD74VY8=vxmxf=NJxWcXa69Q@mail.gmail.com>
Date: Thu, 9 Jan 2025 11:31:55 -0500
From: Neal Cardwell <ncardwell@...gle.com>
To: Lizhe <sensor1010@....com>
Cc: edumazet@...gle.com, davem@...emloft.net, dsahern@...nel.org, 
	kuba@...nel.org, pabeni@...hat.com, horms@...nel.org, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] tcp: Add an extra check for consecutive failed keepalive probes

On Thu, Jan 9, 2025 at 11:21 AM Lizhe <sensor1010@....com> wrote:
>
> Add an additional check to handle situations where consecutive
> keepalive probe packets are sent without receiving a response.
>
> Signed-off-by: Lizhe <sensor1010@....com>
> ---
>  net/ipv4/tcp_timer.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index b412ed88ccd9..5a5dee8cd6d3 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -828,6 +828,12 @@ static void tcp_keepalive_timer (struct timer_list *t)
>                 }
>                 if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
>                         icsk->icsk_probes_out++;
> +                       if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
> +                               tcp_send_active_reset(sk, GFP_ATOMIC,
> +                                               SK_RST_REASON_TCP_KEEPALIVE_TIMEOUT);
> +                               tcp_write_err(sk);
> +                               goto out;
> +                       }
>                         elapsed = keepalive_intvl_when(tp);
>                 } else {
>                         /* If keepalive was lost due to local congestion,
> --

Can you please explain the exact motivation for your patch, ideally
providing either a tcpdump trace or packetdrill test to document the
scenario you are concerned about?

The Linux TCP keepalive logic in tcp_keepalive_timer() already
includes logic (a few lines above the spot you propose to patch) that
ensures that a connection will be closed with ETIMEDOUT if consecutive
keepalive probes fail:

                if ((user_timeout != 0 &&
                    elapsed >= msecs_to_jiffies(user_timeout) &&
                    icsk->icsk_probes_out > 0) ||
                    (user_timeout == 0 &&
                    icsk->icsk_probes_out >= keepalive_probes(tp))) {
                        tcp_send_active_reset(sk, GFP_ATOMIC,

SK_RST_REASON_TCP_KEEPALIVE_TIMEOUT);
                        tcp_write_err(sk);
                        goto out;
                }
                if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
                        icsk->icsk_probes_out++;
                        elapsed = keepalive_intvl_when(tp);

AFAICT your patch nearly duplicates the existing logic, but changes
the application-visible behavior to close the connection after one
fewer timer expiration, thus breaking the semantics of the
net.ipv4.tcp_keepalive_probes.

neal

---

ps: For reference, here is a packetdrill test we use to test this
functionality; this passes on recent Linux kernels:

// Test TCP keepalive behavior without TCP timestamps enabled.

`../common/defaults.sh`

// Create a socket.
    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0

   +0 bind(3, ..., ...) = 0
   +0 listen(3, 1) = 0

// Establish a connection.
   +0 < S 0:0(0) win 20000 <mss 1000,nop,nop,sackOK,nop,wscale 8>
   +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
  +.1 < . 1:1(0) ack 1 win 20000
   +0 accept(3, ..., ...) = 4

// Verify keepalives are disabled by default.
   +0 getsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [0], [4]) = 0
// Enable keepalives:
   +0 setsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0

// Verify default TCP_KEEPIDLE is 7200, from net.ipv4.tcp_keepalive_time=7200:
   +0 getsockopt(4, SOL_TCP, TCP_KEEPIDLE, [7200], [4]) = 0
// Start sending keepalive probes after 3 seconds of idle
   +0 setsockopt(4, SOL_TCP, TCP_KEEPIDLE, [3], 4) = 0

// Verify default TCP_KEEPINTVL is 75, from net.ipv4.tcp_keepalive_intvl=75:
   +0 getsockopt(4, SOL_TCP, TCP_KEEPINTVL, [75], [4]) = 0
// Send keepalive probes every 2 seconds.
   +0 setsockopt(4, SOL_TCP, TCP_KEEPINTVL, [2], 4) = 0

// Verify default TCP_KEEPCNT is 9, from net.ipv4.tcp_keepalive_probes=9:
   +0 getsockopt(4, SOL_TCP, TCP_KEEPCNT, [9], [4]) = 0
// Send 4 keepalive probes before giving up.
   +0 setsockopt(4, SOL_TCP, TCP_KEEPCNT, [4], 4) = 0

// Set up an epoll operation to verify that connections terminated by failed
// keepalives will wake up blocked epoll waiters with EPOLLERR|EPOLLHUP:
   +0 epoll_create(1) = 5
   +0 epoll_ctl(5, EPOLL_CTL_ADD, 4, {events=EPOLLERR, fd=4}) = 0
   +0...11 epoll_wait(5, {events=EPOLLERR|EPOLLHUP, fd=4}, 1, 15000) = 1

// Verify keepalive behavior looks correct, given the parameters above:

// Start sending keepalive probes after 3 seconds of idle.
   +3 > . 0:0(0) ack 1
// Send keepalive probes every 2 seconds.
   +2 > . 0:0(0) ack 1
   +2 > . 0:0(0) ack 1
   +2 > . 0:0(0) ack 1
   +2 > R. 1:1(0) ack 1
// Sent 4 keepalive probes and then gave up and reset the connection.

// Verify that we get the expected error when we try to use the socket:
   +0 read(4, ..., 1000) = -1 ETIMEDOUT (Connection timed out)