linux-kernel - Re: Re: [PATCH] tcp: Add an extra check for consecutive failed keepalive probes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <CADVnQykJBub7Fd_dUoNWbROuSgd+KHOr4JpdBDc3DJVLgU2ULg@mail.gmail.com>
Date: Fri, 10 Jan 2025 11:28:06 -0500
From: Neal Cardwell <ncardwell@...gle.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: lizhe <sensor1010@....com>, davem@...emloft.net, dsahern@...nel.org, 
	kuba@...nel.org, pabeni@...hat.com, horms@...nel.org, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: Re: [PATCH] tcp: Add an extra check for consecutive failed
 keepalive probes

On Fri, Jan 10, 2025 at 11:22 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
>
>
> On Fri, Jan 10, 2025 at 4:58 PM lizhe <sensor1010@....com> wrote:
> >
> > Hi, Neal
> >
> >
> > If the TCP_USER_TIMEOUT option is not enabled, and attempts to send TCP keepalive probes continuously fail,
> >
> > then who limits the number of increments to icsk->icsk_probes_out?
> >
> >
> > Adding this code is feasible. If not added, the system would continuously send keepalive probes without any limit.
> >
> > If these probes continually fail, the process would persist indefinitely because there would be no measure in place to restrict the increments of icsk->icsk_probes_out++.
> >
> >
>
> I think you should provide a packetdrill test, as Neal suggested.
>
> If you write a packetdrill test, chances are very high you will see the code is currently fine.

Yes, indeed. :-)

AFAICT we don't even need a new packetdrill test... I provided a
"repeated keepalive failures" packetdrill test earlier in this thread.
If folks run that test, they should find that recent Linux kernels
correctly terminate a connection after the configured number of
keepalive probes fail.

If there is some corner case we are missing, then you might want to
start from that packetdrill test I pasted in, and try to demonstrate
your corner case that you think the code is missing.

thanks,
neal

> In any case, you have to provide a Fixes: tag for any bug fix for networking code,
> as explained in Documentation/process/maintainer-netdev.rst
>
>  Thank you.
>
> >
> > _Lizhe,
> >
> > thx
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > At 2025-01-10 00:31:55, "Neal Cardwell" <ncardwell@...gle.com> wrote:
> > >On Thu, Jan 9, 2025 at 11:21 AM Lizhe <sensor1010@....com> wrote:
> > >>
> > >> Add an additional check to handle situations where consecutive
> > >> keepalive probe packets are sent without receiving a response.
> > >>
> > >> Signed-off-by: Lizhe <sensor1010@....com>
> > >> ---
> > >>  net/ipv4/tcp_timer.c | 6 ++++++
> > >>  1 file changed, 6 insertions(+)
> > >>
> > >> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> > >> index b412ed88ccd9..5a5dee8cd6d3 100644
> > >> --- a/net/ipv4/tcp_timer.c
> > >> +++ b/net/ipv4/tcp_timer.c
> > >> @@ -828,6 +828,12 @@ static void tcp_keepalive_timer (struct timer_list *t)
> > >>                 }
> > >>                 if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
> > >>                         icsk->icsk_probes_out++;
> > >> +                       if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
> > >> +                               tcp_send_active_reset(sk, GFP_ATOMIC,
> > >> +                                               SK_RST_REASON_TCP_KEEPALIVE_TIMEOUT);
> > >> +                               tcp_write_err(sk);
> > >> +                               goto out;
> > >> +                       }
> > >>                         elapsed = keepalive_intvl_when(tp);
> > >>                 } else {
> > >>                         /* If keepalive was lost due to local congestion,
> > >> --
> > >
> > >Can you please explain the exact motivation for your patch, ideally
> > >providing either a tcpdump trace or packetdrill test to document the
> > >scenario you are concerned about?
> > >
> > >The Linux TCP keepalive logic in tcp_keepalive_timer() already
> > >includes logic (a few lines above the spot you propose to patch) that
> > >ensures that a connection will be closed with ETIMEDOUT if consecutive
> > >keepalive probes fail:
> > >
> > >                if ((user_timeout != 0 &&
> > >                    elapsed >= msecs_to_jiffies(user_timeout) &&
> > >                    icsk->icsk_probes_out > 0) ||
> > >                    (user_timeout == 0 &&
> > >                    icsk->icsk_probes_out >= keepalive_probes(tp))) {
> > >                        tcp_send_active_reset(sk, GFP_ATOMIC,
> > >
> > >SK_RST_REASON_TCP_KEEPALIVE_TIMEOUT);
> > >                        tcp_write_err(sk);
> > >                        goto out;
> > >                }
> > >                if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
> > >                        icsk->icsk_probes_out++;
> > >                        elapsed = keepalive_intvl_when(tp);
> > >
> > >AFAICT your patch nearly duplicates the existing logic, but changes
> > >the application-visible behavior to close the connection after one
> > >fewer timer expiration, thus breaking the semantics of the
> > >net.ipv4.tcp_keepalive_probes.
> > >
> > >neal
> > >
> > >---
> > >
> > >ps: For reference, here is a packetdrill test we use to test this
> > >functionality; this passes on recent Linux kernels:
> > >
> > >// Test TCP keepalive behavior without TCP timestamps enabled.
> > >
> > >`../common/defaults.sh`
> > >
> > >// Create a socket.
> > >    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
> > >   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> > >
> > >   +0 bind(3, ..., ...) = 0
> > >   +0 listen(3, 1) = 0
> > >
> > >// Establish a connection.
> > >   +0 < S 0:0(0) win 20000 <mss 1000,nop,nop,sackOK,nop,wscale 8>
> > >   +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
> > >  +.1 < . 1:1(0) ack 1 win 20000
> > >   +0 accept(3, ..., ...) = 4
> > >
> > >// Verify keepalives are disabled by default.
> > >   +0 getsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [0], [4]) = 0
> > >// Enable keepalives:
> > >   +0 setsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
> > >
> > >// Verify default TCP_KEEPIDLE is 7200, from net.ipv4.tcp_keepalive_time=7200:
> > >   +0 getsockopt(4, SOL_TCP, TCP_KEEPIDLE, [7200], [4]) = 0
> > >// Start sending keepalive probes after 3 seconds of idle
> > >   +0 setsockopt(4, SOL_TCP, TCP_KEEPIDLE, [3], 4) = 0
> > >
> > >// Verify default TCP_KEEPINTVL is 75, from net.ipv4.tcp_keepalive_intvl=75:
> > >   +0 getsockopt(4, SOL_TCP, TCP_KEEPINTVL, [75], [4]) = 0
> > >// Send keepalive probes every 2 seconds.
> > >   +0 setsockopt(4, SOL_TCP, TCP_KEEPINTVL, [2], 4) = 0
> > >
> > >// Verify default TCP_KEEPCNT is 9, from net.ipv4.tcp_keepalive_probes=9:
> > >   +0 getsockopt(4, SOL_TCP, TCP_KEEPCNT, [9], [4]) = 0
> > >// Send 4 keepalive probes before giving up.
> > >   +0 setsockopt(4, SOL_TCP, TCP_KEEPCNT, [4], 4) = 0
> > >
> > >// Set up an epoll operation to verify that connections terminated by failed
> > >// keepalives will wake up blocked epoll waiters with EPOLLERR|EPOLLHUP:
> > >   +0 epoll_create(1) = 5
> > >   +0 epoll_ctl(5, EPOLL_CTL_ADD, 4, {events=EPOLLERR, fd=4}) = 0
> > >   +0...11 epoll_wait(5, {events=EPOLLERR|EPOLLHUP, fd=4}, 1, 15000) = 1
> > >
> > >// Verify keepalive behavior looks correct, given the parameters above:
> > >
> > >// Start sending keepalive probes after 3 seconds of idle.
> > >   +3 > . 0:0(0) ack 1
> > >// Send keepalive probes every 2 seconds.
> > >   +2 > . 0:0(0) ack 1
> > >   +2 > . 0:0(0) ack 1
> > >   +2 > . 0:0(0) ack 1
> > >   +2 > R. 1:1(0) ack 1
> > >// Sent 4 keepalive probes and then gave up and reset the connection.
> > >
> > >// Verify that we get the expected error when we try to use the socket:
> > >   +0 read(4, ..., 1000) = -1 ETIMEDOUT (Connection timed out)