netdev - Re: [PATCH net] tcp: repaired skbs must init their tso

Message-ID: <20190226092302.GA25925@gmail.com>
Date:   Tue, 26 Feb 2019 01:23:05 -0800
From:   Andrei Vagin <avagin@...il.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     "David S . Miller" <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Soheil Hassas Yeganeh <soheil@...gle.com>,
        Neal Cardwell <ncardwell@...gle.com>,
        Yuchung Cheng <ycheng@...gle.com>,
        syzbot <syzkaller@...glegroups.com>,
        Andrey Vagin <avagin@...nvz.org>
Subject: Re: [PATCH net] tcp: repaired skbs must init their tso_segs

On Sat, Feb 23, 2019 at 03:51:51PM -0800, Eric Dumazet wrote:
> syzbot reported a WARN_ON(!tcp_skb_pcount(skb))
> in tcp_send_loss_probe() [1]
> 
> This was caused by TCP_REPAIR sent skbs that inadvertently
> were missing a call to tcp_init_tso_segs()
> 
> [1]
> WARNING: CPU: 1 PID: 0 at net/ipv4/tcp_output.c:2534 tcp_send_loss_probe+0x771/0x8a0 net/ipv4/tcp_output.c:2534
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.0.0-rc7+ #77
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  <IRQ>
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
>  panic+0x2cb/0x65c kernel/panic.c:214
>  __warn.cold+0x20/0x45 kernel/panic.c:571
>  report_bug+0x263/0x2b0 lib/bug.c:186
>  fixup_bug arch/x86/kernel/traps.c:178 [inline]
>  fixup_bug arch/x86/kernel/traps.c:173 [inline]
>  do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
>  do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:290
>  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
> RIP: 0010:tcp_send_loss_probe+0x771/0x8a0 net/ipv4/tcp_output.c:2534
> Code: 88 fc ff ff 4c 89 ef e8 ed 75 c8 fb e9 c8 fc ff ff e8 43 76 c8 fb e9 63 fd ff ff e8 d9 75 c8 fb e9 94 f9 ff ff e8 bf 03 91 fb <0f> 0b e9 7d fa ff ff e8 b3 03 91 fb 0f b6 1d 37 43 7a 03 31 ff 89
> RSP: 0018:ffff8880ae907c60 EFLAGS: 00010206
> RAX: ffff8880a989c340 RBX: 0000000000000000 RCX: ffffffff85dedbdb
> RDX: 0000000000000100 RSI: ffffffff85dee0b1 RDI: 0000000000000005
> RBP: ffff8880ae907c90 R08: ffff8880a989c340 R09: ffffed10147d1ae1
> R10: ffffed10147d1ae0 R11: ffff8880a3e8d703 R12: ffff888091b90040
> R13: ffff8880a3e8d540 R14: 0000000000008000 R15: ffff888091b90860
>  tcp_write_timer_handler+0x5c0/0x8a0 net/ipv4/tcp_timer.c:583
>  tcp_write_timer+0x10e/0x1d0 net/ipv4/tcp_timer.c:607
>  call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
>  expire_timers kernel/time/timer.c:1362 [inline]
>  __run_timers kernel/time/timer.c:1681 [inline]
>  __run_timers kernel/time/timer.c:1649 [inline]
>  run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
>  __do_softirq+0x266/0x95a kernel/softirq.c:292
>  invoke_softirq kernel/softirq.c:373 [inline]
>  irq_exit+0x180/0x1d0 kernel/softirq.c:413
>  exiting_irq arch/x86/include/asm/apic.h:536 [inline]
>  smp_apic_timer_interrupt+0x14a/0x570 arch/x86/kernel/apic/apic.c:1062
>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807
>  </IRQ>
> RIP: 0010:native_safe_halt+0x2/0x10 arch/x86/include/asm/irqflags.h:58
> Code: ff ff ff 48 89 c7 48 89 45 d8 e8 59 0c a1 fa 48 8b 45 d8 e9 ce fe ff ff 48 89 df e8 48 0c a1 fa eb 82 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90
> RSP: 0018:ffff8880a98afd78 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
> RAX: 1ffffffff1125061 RBX: ffff8880a989c340 RCX: 0000000000000000
> RDX: dffffc0000000000 RSI: 0000000000000001 RDI: ffff8880a989cbbc
> RBP: ffff8880a98afda8 R08: ffff8880a989c340 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> R13: ffffffff889282f8 R14: 0000000000000001 R15: 0000000000000000
>  arch_cpu_idle+0x10/0x20 arch/x86/kernel/process.c:555
>  default_idle_call+0x36/0x90 kernel/sched/idle.c:93
>  cpuidle_idle_call kernel/sched/idle.c:153 [inline]
>  do_idle+0x386/0x570 kernel/sched/idle.c:262
>  cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:353
>  start_secondary+0x404/0x5c0 arch/x86/kernel/smpboot.c:271
>  secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
> 

Thank you, Eric. I saw a few test failures where tcp_peek_sndq()
returned more data than we expected. I ran the test in a loop with this
fix applied and it passes without any problems. Without the fix, it
fails after a few iterations.

https://github.com/checkpoint-restore/criu/issues/622

> Fixes: 79861919b889 ("tcp: fix TCP_REPAIR xmit queue setup")
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Reported-by: syzbot <syzkaller@...glegroups.com>
> Cc: Andrey Vagin <avagin@...nvz.org>
> Cc: Soheil Hassas Yeganeh <soheil@...gle.com>
> Cc: Neal Cardwell <ncardwell@...gle.com>
> ---
>  net/ipv4/tcp_output.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 730bc44dbad9363814705b28c2f91a2253d91207..ccc78f3a4b60d3012430488bdfbcfc5122ff8627 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2347,6 +2347,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
>  			/* "skb_mstamp_ns" is used as a start point for the retransmit timer */
>  			skb->skb_mstamp_ns = tp->tcp_wstamp_ns = tp->tcp_clock_cache;
>  			list_move_tail(&skb->tcp_tsorted_anchor, &tp->tsorted_sent_queue);
> +			tcp_init_tso_segs(skb, mss_now);
>  			goto repair; /* Skip network transmission */
>  		}
>  
> -- 
> 2.21.0.rc0.258.g878e2cd30e-goog
>
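
For anyone following the thread, here is a rough userspace sketch of the
invariant the one-line fix restores. It is only a simplified model, not the
kernel code: struct model_skb and the model_* helpers are hypothetical names
made up for illustration, and the segment count is assumed to be 1 when the
payload fits in one MSS and ceil(len / mss_now) otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for an skb; only the fields needed here. */
struct model_skb {
	unsigned int len;      /* payload length in bytes */
	unsigned int tso_segs; /* stands in for tcp_skb_pcount(skb) */
};

/* Assumed model of the segment count: 1 if the payload fits in one MSS,
 * otherwise the payload length divided by the MSS, rounded up.
 */
static unsigned int model_tso_segs(unsigned int len, unsigned int mss_now)
{
	assert(mss_now > 0);
	if (len <= mss_now)
		return 1;
	return (len + mss_now - 1) / mss_now;
}

/* Simplified model of what the added call achieves for a repaired skb:
 * after it runs, the segment count is never zero.
 */
static void model_init_tso_segs(struct model_skb *skb, unsigned int mss_now)
{
	skb->tso_segs = model_tso_segs(skb->len, mss_now);
}

/* Models the condition checked by WARN_ON(!tcp_skb_pcount(skb)). */
static bool model_warn_would_fire(const struct model_skb *skb)
{
	return skb->tso_segs == 0;
}
```

In this model, an skb queued with tso_segs left at 0 (the repair path before
the fix) makes model_warn_would_fire() return true; once
model_init_tso_segs() has been called on it, the count is at least 1 and the
check passes.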