netdev - Re: [PATCH] tcp: Avoid infinite loop on recvmsg bug

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAODwPW_P_UNGV8rRC7hFC+PBrWeu64ODu5mTby=9GCqMnyDtdQ@mail.gmail.com>
Date:	Mon, 10 Dec 2012 11:33:27 -0800
From:	Julius Werner <jwerner@...omium.org>
To:	Dave Jones <davej@...hat.com>
Cc:	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	Patrick McHardy <kaber@...sh.net>,
	Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
	James Morris <jmorris@...ei.org>,
	Alexey Kuznetsov <kuznet@....inr.ac.ru>,
	Eric Dumazet <eric.dumazet@...il.com>,
	"David S. Miller" <davem@...emloft.net>,
	Sameer Nanda <snanda@...omium.org>,
	Mandeep Singh Baines <msb@...omium.org>
Subject: Re: [PATCH] tcp: Avoid infinite loop on recvmsg bug

Hi Dave,

Have you thought about picking up one of the patches to tcp_recvmsg I
proposed in this thread? We consider the underlying bug in Chromium OS
that led mere here to be fixed now, but I bet this will not be the
last time someone hits this code path and has to deal with the bad
error handling.

I understand that not everyone here agrees on what the best solution
is, but I think both of them are far better than the inconsistent and
potentially hard-disk-filling way that the current kernel does it.

On Wed, 2012-11-07 at 11:33 -0800, Julius Werner wrote:
> tcp_recvmsg contains a sanity check that WARNs when there is a gap
> between the socket's copied_seq and the first buffer in the
> sk_receive_queue. In theory, the TCP stack makes sure that This Should
> Never Happen (TM)... however, practice shows that there are still a few
> bug reports from it out there (and one in my inbox).
>
> Unfortunately, when it does happen for whatever reason, the situation
> is not handled very well: the kernel logs a warning and breaks out of
> the loop that walks the receive queue. It proceeds to find nothing else
> to do on the socket and hits sk_wait_data, which cannot block because
> the receive queue is not empty. As no data was read, the outer while
> loop repeats (logging the same warning again) ad infinitum until the
> system's syslog exhausts all available hard drive capacity.
>
> This patch addresses that issue by closing the socket outright and
> throwing EBADFD to userspace (which seems most appropriate to me at this
> point). As the underlying bug condition is "impossible" and therefore by
> definition unrecoverable, this is the only sensible action other than a
> full panic.
>
> Signed-off-by: Julius Werner <jwerner@...omium.org>
> ---
>  net/ipv4/tcp.c |    7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 197c000..d612308 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1628,7 +1628,7 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
>                                "recvmsg bug: copied %X seq %X rcvnxt %X fl %X\n",
>                                *seq, TCP_SKB_CB(skb)->seq, tp->rcv_nxt,
>                                flags))
> -                             break;
> +                             goto selfdestruct;
>
>                       offset = *seq - TCP_SKB_CB(skb)->seq;
>                       if (tcp_hdr(skb)->syn)
> @@ -1936,6 +1936,11 @@ recv_urg:
>  recv_sndq:
>       err = tcp_peek_sndq(sk, msg, len);
>       goto out;
> +
> +selfdestruct:
> +     err = -EBADFD;
> +     tcp_done(sk);
> +     goto out;
>  }
>  EXPORT_SYMBOL(tcp_recvmsg);
>

On Tue, Nov 06, 2012 at 04:15:35PM -0800, Julius Werner wrote:
> tcp_recvmsg contains a sanity check that WARNs when there is a gap
> between the socket's copied_seq and the first buffer in the
> sk_receive_queue. In theory, the TCP stack makes sure that This Should
> Never Happen (TM)... however, practice shows that there are still a few
> bug reports from it out there (and one in my inbox).
>
> Unfortunately, when it does happen for whatever reason, the situation
> is not handled very well: the kernel logs a warning and breaks out of
> the loop that walks the receive queue. It proceeds to find nothing else
> to do on the socket and hits sk_wait_data, which cannot block because
> the receive queue is not empty. As no data was read, the outer while
> loop repeats (logging the same warning again) ad infinitum until the
> system's syslog exhausts all available hard drive capacity.
>
> This patch improves that behavior by going straight to a proper kernel
> crash. The cause of the error can be identified right away and the
> system's hard drive is not unnecessarily strained.
>
> Signed-off-by: Julius Werner <jwerner@...omium.org>
> ---
>  net/ipv4/tcp.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 197c000..fcb0927 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1628,7 +1628,7 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
>                               "recvmsg bug: copied %X seq %X rcvnxt %X fl %X\n",
>                               *seq, TCP_SKB_CB(skb)->seq, tp->rcv_nxt,
>                               flags))
> -                            break;
> +                            BUG();
>
>                      offset = *seq - TCP_SKB_CB(skb)->seq;
>                      if (tcp_hdr(skb)->syn)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html