Message-ID: <CALMXkpZfufqWhvd9F4kbtC18bFYCgNWrkEvL7Tw976i24f1EFw@mail.gmail.com>
Date:   Wed, 11 Sep 2019 10:36:45 -0700
From:   Christoph Paasch <christoph.paasch@...il.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     "David S . Miller" <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Soheil Hassas Yeganeh <soheil@...gle.com>,
        Neal Cardwell <ncardwell@...gle.com>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Jason Baron <jbaron@...mai.com>,
        Vladimir Rutsky <rutsky@...gle.com>
Subject: Re: [PATCH net] tcp: remove empty skb from write queue in error cases

Hello,

On Mon, Aug 26, 2019 at 11:04 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> Vladimir Rutsky reported stuck TCP sessions after memory pressure
> events. Edge Trigger epoll() user would never receive an EPOLLOUT
> notification allowing them to retry a sendmsg().
>
> Jason tested the case of sk_stream_alloc_skb() returning NULL,
> but there are other paths that could lead both sendmsg() and sendpage()
> to return -1 (EAGAIN), with an empty skb queued on the write queue.
>
> This patch makes sure we remove this empty skb so that
> Jason code can detect that the queue is empty, and
> call sk->sk_write_space(sk) accordingly.
>
> Fixes: ce5ec440994b ("tcp: ensure epoll edge trigger wakeup when write queue is empty")
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Cc: Jason Baron <jbaron@...mai.com>
> Reported-by: Vladimir Rutsky <rutsky@...gle.com>
> Cc: Soheil Hassas Yeganeh <soheil@...gle.com>
> Cc: Neal Cardwell <ncardwell@...gle.com>
> ---
>  net/ipv4/tcp.c | 30 ++++++++++++++++++++----------
>  1 file changed, 20 insertions(+), 10 deletions(-)

syzkaller now reports a crash on 4.14.143 with the following reproducer:

# {Threaded:true Collide:true Repeat:true RepeatTimes:0 Procs:1 Sandbox: Fault:false FaultCall:-1 FaultNth:0 EnableTun:false UseTmpDir:false EnableCgroups:false EnableNetdev:false ResetNet:false HandleSegv:false Repro:false Trace:false}
r0 = socket$inet_tcp(0x2, 0x1, 0x0)
setsockopt$inet_tcp_TCP_REPAIR(r0, 0x6, 0x13, &(0x7f0000000040)=0x1, 0x4)
setsockopt$inet_tcp_TCP_REPAIR_QUEUE(r0, 0x6, 0x14, &(0x7f00000012c0)=0x2, 0x4)
setsockopt$inet_tcp_int(r0, 0x6, 0x19, &(0x7f0000000000)=0x9, 0x4)
setsockopt$inet_tcp_TCP_MD5SIG(r0, 0x6, 0xe, &(0x7f00000001c0)={@...{{0x2, 0x0, @empty}}, 0x0, 0x2, 0x0, "c157cf4809151e5e89cfd6d934fbe981ec8ff6afc252ccf486c325c7ff3d35f3a89412a5cb6430e169092617df2ba65bf0ab844572e4e7dd4ece8ec1de5ac1ccd870067b018cb3b1f05f2391d872b67d"}, 0xd8)
connect$inet(r0, &(0x7f0000000080)={0x2, 0x0, @dev={0xac, 0x14, 0x14, 0x1d}}, 0x10)
sendto(r0, 0x0, 0x87, 0x0, 0x0, 0x391)

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN PTI
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 2529 Comm: syz-executor709 Not tainted 4.14.143 #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
task: ffff8880677fdc00 task.stack: ffff8880642b0000
RIP: 0010:tcp_sendmsg_locked+0x6b4/0x4390 net/ipv4/tcp.c:1350
RSP: 0018:ffff8880642bf718 EFLAGS: 00010206
RAX: 0000000000000014 RBX: 0000000000000087 RCX: ffff88806a794f50
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000000a0
RBP: ffff8880642bfaa8 R08: 0000000000000006 R09: ffff8880677fe3a0
R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
R13: ffff88806a794f50 R14: ffff88806a794d00 R15: 0000000000000087
FS:  00007f644b697700(0000) GS:ffff88806cf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffcd37370b0 CR3: 00000000679f2006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 tcp_sendmsg+0x2a/0x40 net/ipv4/tcp.c:1533
 inet_sendmsg+0x173/0x4e0 net/ipv4/af_inet.c:784
 sock_sendmsg_nosec net/socket.c:646 [inline]
 sock_sendmsg+0xc3/0x100 net/socket.c:656
 SYSC_sendto+0x35d/0x5e0 net/socket.c:1766
 do_syscall_64+0x241/0x680 arch/x86/entry/common.c:292
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x7f644afc6469
RSP: 002b:00007f644b696f28 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000602130 RCX: 00007f644afc6469
RDX: 0000000000000087 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000602138 R08: 0000000000000000 R09: 0000000000000391
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000060213c
R13: 00007ffcd373700f R14: 00007f644b677000 R15: 0000000000000003
Code: 74 08 3c 03 0f 8e f1 32 00 00 8b 85 98 fd ff ff 89 85 60 fd ff ff 48 8b 85 70 fd ff ff 48 8d b8 a0 00 00 00 48 89 f8 48 c1 e8 03 <42> 0f b6 04 20 84 c0 74 06 0f 8e d2 32 00 00 4c 8b bd 70 fd ff
RIP: tcp_sendmsg_locked+0x6b4/0x4390 net/ipv4/tcp.c:1350 RSP: ffff8880642bf718
---[ end trace 70f07f242cd3b9d8 ]---


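The fault occurs because skb is NULL in tcp_sendmsg_locked(): the write
queue is empty, yet tcp_send_head(sk) is still non-NULL, so skb gets
dereferenced at: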
                  skb = tcp_write_queue_tail(sk);
                  if (tcp_send_head(sk)) {
                          if (skb->ip_summed == CHECKSUM_NONE)


I think we need this here on pre-rb-tree kernels:

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 5ce069ce2a97..efe767e20d01 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -924,8 +924,7 @@ static void tcp_remove_empty_skb(struct sock *sk, struct sk_buff *skb)
 {
 	if (skb && !skb->len) {
 		tcp_unlink_write_queue(skb, sk);
-		if (tcp_write_queue_empty(sk))
-			tcp_chrono_stop(sk, TCP_CHRONO_BUSY);
+		tcp_check_send_head(sk, skb);
 		sk_wmem_free_skb(sk, skb);
 	}
 }

Does that look good?
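
For illustration only, here is a minimal user-space sketch of the state the
oops suggests: a separately tracked send-head pointer that still refers to an
skb after it has been unlinked from the write queue. This is not kernel code.
All model_* names are hypothetical stand-ins, and the sketch assumes that
tcp_check_send_head() resets the send head when it points at the unlinked skb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct sk_buff and struct sock. */
struct model_skb {
	struct model_skb *prev, *next;
	unsigned int len;
};

struct model_sock {
	struct model_skb *head, *tail;	/* write queue (linked list) */
	struct model_skb *send_head;	/* first unsent skb, tracked separately */
};

/* Append to the write queue; the first queued skb becomes the send head. */
static void model_queue_tail(struct model_sock *sk, struct model_skb *skb)
{
	skb->next = NULL;
	skb->prev = sk->tail;
	if (sk->tail)
		sk->tail->next = skb;
	else
		sk->head = skb;
	sk->tail = skb;
	if (!sk->send_head)
		sk->send_head = skb;
}

/* Unlink from the queue only; the send head is deliberately left alone. */
static void model_unlink(struct model_sock *sk, struct model_skb *skb)
{
	if (skb->prev)
		skb->prev->next = skb->next;
	else
		sk->head = skb->next;
	if (skb->next)
		skb->next->prev = skb->prev;
	else
		sk->tail = skb->prev;
	skb->prev = skb->next = NULL;
}

/* Removal without any send-head fix-up: the pointer can go stale. */
static void model_remove_empty_unfixed(struct model_sock *sk,
				       struct model_skb *skb)
{
	if (skb && !skb->len)
		model_unlink(sk, skb);
}

/* Removal that also clears the send head if it was the unlinked skb. */
static void model_remove_empty_fixed(struct model_sock *sk,
				     struct model_skb *skb)
{
	if (skb && !skb->len) {
		model_unlink(sk, skb);
		if (sk->send_head == skb)
			sk->send_head = NULL;
	}
}

/* True when the queue tail is NULL but the send head is still set. */
static bool model_stale_send_head(const struct model_sock *sk)
{
	return sk->send_head != NULL && sk->tail == NULL;
}

/* Queue one empty skb, remove it, and report whether the state is bad. */
static bool model_demo(bool fixed)
{
	struct model_sock sk = {0};
	struct model_skb skb = {0};	/* len == 0: an "empty" skb */

	model_queue_tail(&sk, &skb);
	if (fixed)
		model_remove_empty_fixed(&sk, &skb);
	else
		model_remove_empty_unfixed(&sk, &skb);
	return model_stale_send_head(&sk);
}
```

In this model, model_demo(false) leaves a non-NULL send head with a NULL
queue tail, which is the combination that makes the skb->ip_summed access
above fault, while model_demo(true) leaves both pointers NULL.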


Thanks,
Christoph
