Message-ID: <CACSApvZyocduot4M6-FNoFpeL9vHeBd74HPWqfuQoOaQmkdqBg@mail.gmail.com>
Date: Sat, 23 Feb 2019 19:01:11 -0500
From: Soheil Hassas Yeganeh <soheil@...gle.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>,
Eric Dumazet <eric.dumazet@...il.com>,
Neal Cardwell <ncardwell@...gle.com>,
Yuchung Cheng <ycheng@...gle.com>,
syzbot <syzkaller@...glegroups.com>,
Andrey Vagin <avagin@...nvz.org>
Subject: Re: [PATCH net] tcp: repaired skbs must init their tso_segs
On Sat, Feb 23, 2019 at 6:51 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> syzbot reported a WARN_ON(!tcp_skb_pcount(skb))
> in tcp_send_loss_probe() [1]
>
> This was caused by TCP_REPAIR sent skbs that inadvertently
> were missing a call to tcp_init_tso_segs()
>
> [1]
> WARNING: CPU: 1 PID: 0 at net/ipv4/tcp_output.c:2534 tcp_send_loss_probe+0x771/0x8a0 net/ipv4/tcp_output.c:2534
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.0.0-rc7+ #77
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
> <IRQ>
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x172/0x1f0 lib/dump_stack.c:113
> panic+0x2cb/0x65c kernel/panic.c:214
> __warn.cold+0x20/0x45 kernel/panic.c:571
> report_bug+0x263/0x2b0 lib/bug.c:186
> fixup_bug arch/x86/kernel/traps.c:178 [inline]
> fixup_bug arch/x86/kernel/traps.c:173 [inline]
> do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
> do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:290
> invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
> RIP: 0010:tcp_send_loss_probe+0x771/0x8a0 net/ipv4/tcp_output.c:2534
> Code: 88 fc ff ff 4c 89 ef e8 ed 75 c8 fb e9 c8 fc ff ff e8 43 76 c8 fb e9 63 fd ff ff e8 d9 75 c8 fb e9 94 f9 ff ff e8 bf 03 91 fb <0f> 0b e9 7d fa ff ff e8 b3 03 91 fb 0f b6 1d 37 43 7a 03 31 ff 89
> RSP: 0018:ffff8880ae907c60 EFLAGS: 00010206
> RAX: ffff8880a989c340 RBX: 0000000000000000 RCX: ffffffff85dedbdb
> RDX: 0000000000000100 RSI: ffffffff85dee0b1 RDI: 0000000000000005
> RBP: ffff8880ae907c90 R08: ffff8880a989c340 R09: ffffed10147d1ae1
> R10: ffffed10147d1ae0 R11: ffff8880a3e8d703 R12: ffff888091b90040
> R13: ffff8880a3e8d540 R14: 0000000000008000 R15: ffff888091b90860
> tcp_write_timer_handler+0x5c0/0x8a0 net/ipv4/tcp_timer.c:583
> tcp_write_timer+0x10e/0x1d0 net/ipv4/tcp_timer.c:607
> call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
> expire_timers kernel/time/timer.c:1362 [inline]
> __run_timers kernel/time/timer.c:1681 [inline]
> __run_timers kernel/time/timer.c:1649 [inline]
> run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
> __do_softirq+0x266/0x95a kernel/softirq.c:292
> invoke_softirq kernel/softirq.c:373 [inline]
> irq_exit+0x180/0x1d0 kernel/softirq.c:413
> exiting_irq arch/x86/include/asm/apic.h:536 [inline]
> smp_apic_timer_interrupt+0x14a/0x570 arch/x86/kernel/apic/apic.c:1062
> apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807
> </IRQ>
> RIP: 0010:native_safe_halt+0x2/0x10 arch/x86/include/asm/irqflags.h:58
> Code: ff ff ff 48 89 c7 48 89 45 d8 e8 59 0c a1 fa 48 8b 45 d8 e9 ce fe ff ff 48 89 df e8 48 0c a1 fa eb 82 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90
> RSP: 0018:ffff8880a98afd78 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
> RAX: 1ffffffff1125061 RBX: ffff8880a989c340 RCX: 0000000000000000
> RDX: dffffc0000000000 RSI: 0000000000000001 RDI: ffff8880a989cbbc
> RBP: ffff8880a98afda8 R08: ffff8880a989c340 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> R13: ffffffff889282f8 R14: 0000000000000001 R15: 0000000000000000
> arch_cpu_idle+0x10/0x20 arch/x86/kernel/process.c:555
> default_idle_call+0x36/0x90 kernel/sched/idle.c:93
> cpuidle_idle_call kernel/sched/idle.c:153 [inline]
> do_idle+0x386/0x570 kernel/sched/idle.c:262
> cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:353
> start_secondary+0x404/0x5c0 arch/x86/kernel/smpboot.c:271
> secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
>
> Fixes: 79861919b889 ("tcp: fix TCP_REPAIR xmit queue setup")
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Reported-by: syzbot <syzkaller@...glegroups.com>
> Cc: Andrey Vagin <avagin@...nvz.org>
> Cc: Soheil Hassas Yeganeh <soheil@...gle.com>
> Cc: Neal Cardwell <ncardwell@...gle.com>
Acked-by: Soheil Hassas Yeganeh <soheil@...gle.com>
Very nice catch and fix! This was a really cryptic failure. Thank you Eric!
> ---
> net/ipv4/tcp_output.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 730bc44dbad9363814705b28c2f91a2253d91207..ccc78f3a4b60d3012430488bdfbcfc5122ff8627 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2347,6 +2347,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
> /* "skb_mstamp_ns" is used as a start point for the retransmit timer */
> skb->skb_mstamp_ns = tp->tcp_wstamp_ns = tp->tcp_clock_cache;
> list_move_tail(&skb->tcp_tsorted_anchor, &tp->tsorted_sent_queue);
> + tcp_init_tso_segs(skb, mss_now);
> goto repair; /* Skip network transmission */
> }
>
> --
> 2.21.0.rc0.258.g878e2cd30e-goog
>
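For readers following the thread, here is a rough userspace sketch of why the missing call matters. It is not the kernel code: struct fake_skb, div_round_up() and fake_init_tso_segs() are hypothetical names made up for illustration, and the segment-count rule (one segment when the payload fits in one MSS, otherwise the length divided by the MSS, rounded up) is my understanding of what tcp_init_tso_segs() ends up computing. An skb that skips this step keeps a zero count, which is the condition WARN_ON(!tcp_skb_pcount(skb)) checks for.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the two skb fields involved. */
struct fake_skb {
	unsigned int len;      /* payload length in bytes */
	unsigned int tso_segs; /* stand-in for what tcp_skb_pcount() reports */
};

/* Ceiling division, in the spirit of the kernel's DIV_ROUND_UP(). */
static unsigned int div_round_up(unsigned int n, unsigned int d)
{
	return (n + d - 1) / d;
}

/*
 * Assumed model of segment-count initialization:
 * a payload that fits in one MSS counts as a single segment,
 * anything larger is split into ceil(len / mss_now) segments.
 */
static void fake_init_tso_segs(struct fake_skb *skb, unsigned int mss_now)
{
	assert(mss_now > 0);

	if (skb->len <= mss_now)
		skb->tso_segs = 1;
	else
		skb->tso_segs = div_round_up(skb->len, mss_now);
}
```

In this model, a 4000-byte skb with mss_now of 1460 starts with tso_segs == 0 and ends up with 3 after fake_init_tso_segs(). Skipping the call, as the repair path did before this patch, leaves the count at 0 for any later code that assumes it is nonzero.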