lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iK+7CKO31=3EvNo6-raUzyibwRRN8HkNXeqzuP9q8k_tA@mail.gmail.com>
Date: Tue, 3 Dec 2024 12:07:05 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Youngmin Nam <youngmin.nam@...sung.com>
Cc: davem@...emloft.net, dsahern@...nel.org, kuba@...nel.org, 
	pabeni@...hat.com, horms@...nel.org, dujeong.lee@...sung.com, 
	guo88.liu@...sung.com, yiwang.cai@...sung.com, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org, joonki.min@...sung.com, hajun.sung@...sung.com, 
	d7271.choe@...sung.com, sw.ju@...sung.com, 
	Neal Cardwell <ncardwell@...gle.com>
Subject: Re: [PATCH] tcp: check socket state before calling WARN_ON

On Tue, Dec 3, 2024 at 9:10 AM Youngmin Nam <youngmin.nam@...sung.com> wrote:
>
> We encountered the following WARNINGs
> in tcp_sacktag_write_queue()/tcp_fastretrans_alert()
> which triggered a kernel panic due to panic_on_warn.
>
> case 1.
> ------------[ cut here ]------------
> WARNING: CPU: 4 PID: 453 at net/ipv4/tcp_input.c:2026
> Call trace:
>  tcp_sacktag_write_queue+0xae8/0xb60
>  tcp_ack+0x4ec/0x12b8
>  tcp_rcv_state_process+0x22c/0xd38
>  tcp_v4_do_rcv+0x220/0x300
>  tcp_v4_rcv+0xa5c/0xbb4
>  ip_protocol_deliver_rcu+0x198/0x34c
>  ip_local_deliver_finish+0x94/0xc4
>  ip_local_deliver+0x74/0x10c
>  ip_rcv+0xa0/0x13c
> Kernel panic - not syncing: kernel: panic_on_warn set ...
>
> case 2.
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 648 at net/ipv4/tcp_input.c:3004
> Call trace:
>  tcp_fastretrans_alert+0x8ac/0xa74
>  tcp_ack+0x904/0x12b8
>  tcp_rcv_state_process+0x22c/0xd38
>  tcp_v4_do_rcv+0x220/0x300
>  tcp_v4_rcv+0xa5c/0xbb4
>  ip_protocol_deliver_rcu+0x198/0x34c
>  ip_local_deliver_finish+0x94/0xc4
>  ip_local_deliver+0x74/0x10c
>  ip_rcv+0xa0/0x13c
> Kernel panic - not syncing: kernel: panic_on_warn set ...
>

I have not seen these warnings firing. Neal, have you seen this in the past ?

Please provide the kernel version (this must be a pristine LTS one).
and symbolized stack traces using scripts/decode_stacktrace.sh

If this warning was easy to trigger, please provide a packetdrill test ?



> When we check the socket state value at the time of the issue,
> it was 0x4.
>
> skc_state = 0x4,
>
> This is "TCP_FIN_WAIT1" and which means the device closed its socket.
>
> enum {
>         TCP_ESTABLISHED = 1,
>         TCP_SYN_SENT,
>         TCP_SYN_RECV,
>         TCP_FIN_WAIT1,
>
> And also this means tp->packets_out was initialized as 0
> by tcp_write_queue_purge().

What stack trace leads to this tcp_write_queue_purge() exactly ?

>
> In a congested network situation, a TCP ACK for
> an already closed session may be received with a delay from the peer.
> This can trigger the WARN_ON macro to help debug the situation.
>
> To make this situation more meaningful, we would like to call
> WARN_ON only when the state of the socket is "TCP_ESTABLISHED".
> This will prevent the kernel from triggering a panic
> due to panic_on_warn.
>
> Signed-off-by: Youngmin Nam <youngmin.nam@...sung.com>
> ---
>  net/ipv4/tcp_input.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 5bdf13ac26ef..62f4c285ab80 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2037,7 +2037,8 @@ tcp_sacktag_write_queue(struct sock *sk, const struct sk_buff *ack_skb,
>         WARN_ON((int)tp->sacked_out < 0);
>         WARN_ON((int)tp->lost_out < 0);
>         WARN_ON((int)tp->retrans_out < 0);
> -       WARN_ON((int)tcp_packets_in_flight(tp) < 0);
> +       if (sk->sk_state == TCP_ESTABLISHED)

In any case this test on sk_state is too specific.

> +               WARN_ON((int)tcp_packets_in_flight(tp) < 0);
>  #endif
>         return state->flag;
>  }
> @@ -3080,7 +3081,8 @@ static void tcp_fastretrans_alert(struct sock *sk, const u32 prior_snd_una,
>                 return;
>
>         /* C. Check consistency of the current state. */
> -       tcp_verify_left_out(tp);
> +       if (sk->sk_state == TCP_ESTABLISHED)
> +               tcp_verify_left_out(tp);
>
>         /* D. Check state exit conditions. State can be terminated
>          *    when high_seq is ACKed. */
> --
> 2.39.2
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ