lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.OSX.2.22.394.2004291151040.42484@jdfowle1-mobl1.amr.corp.intel.com>
Date:   Wed, 29 Apr 2020 11:54:02 -0700 (PDT)
From:   Mat Martineau <mathew.j.martineau@...ux.intel.com>
To:     Paolo Abeni <pabeni@...hat.com>
cc:     netdev@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>,
        Matthieu Baerts <matthieu.baerts@...sares.net>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Christoph Paasch <cpaasch@...le.com>, mptcp@...ts.01.org
Subject: Re: [PATCH net 2/5] mptcp: move option parsing into
 mptcp_incoming_options()


On Wed, 29 Apr 2020, Paolo Abeni wrote:

> The mptcp_options_received structure carries several per
> packet flags (mp_capable, mp_join, etc.). Such fields must
> be cleared on each packet, even on dropped ones or packet
> not carrying any MPTCP options, but the current mptcp
> code clears them only on TCP option reset.
>
> On several races/corner cases we end-up with stray bits in
> incoming options, leading to WARN_ON splats. e.g.:
>
> [  171.164906] Bad mapping: ssn=32714 map_seq=1 map_data_len=32713
> [  171.165006] WARNING: CPU: 1 PID: 5026 at net/mptcp/subflow.c:533 warn_bad_map (linux-mptcp/net/mptcp/subflow.c:533 linux-mptcp/net/mptcp/subflow.c:531)
> [  171.167632] Modules linked in: ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel geneve ip6_udp_tunnel udp_tunnel macsec macvtap tap ipvlan macvlan 8021q garp mrp xfrm_interface veth netdevsim nlmon dummy team bonding vcan bridge stp llc ip6_gre gre ip6_tunnel tunnel6 tun binfmt_misc intel_rapl_msr intel_rapl_common rfkill kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev virtio_balloon pcspkr i2c_piix4 sunrpc ip_tables xfs libcrc32c crc32c_intel serio_raw virtio_console ata_generic virtio_blk virtio_net net_failover failover ata_piix libata
> [  171.199464] CPU: 1 PID: 5026 Comm: repro Not tainted 5.7.0-rc1.mptcp_f227fdf5d388+ #95
> [  171.200886] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
> [  171.202546] RIP: 0010:warn_bad_map (linux-mptcp/net/mptcp/subflow.c:533 linux-mptcp/net/mptcp/subflow.c:531)
> [  171.206537] Code: c1 ea 03 0f b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2 75 1d 8b 55 3c 44 89 e6 48 c7 c7 20 51 13 95 e8 37 8b 22 fe <0f> 0b 48 83 c4 08 5b 5d 41 5c c3 89 4c 24 04 e8 db d6 94 fe 8b 4c
> [  171.220473] RSP: 0018:ffffc90000150560 EFLAGS: 00010282
> [  171.221639] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [  171.223108] RDX: 0000000000000000 RSI: 0000000000000008 RDI: fffff5200002a09e
> [  171.224388] RBP: ffff8880aa6e3c00 R08: 0000000000000001 R09: fffffbfff2ec9955
> [  171.225706] R10: ffffffff9764caa7 R11: fffffbfff2ec9954 R12: 0000000000007fca
> [  171.227211] R13: ffff8881066f4a7f R14: ffff8880aa6e3c00 R15: 0000000000000020
> [  171.228460] FS:  00007f8623719740(0000) GS:ffff88810be00000(0000) knlGS:0000000000000000
> [  171.230065] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  171.231303] CR2: 00007ffdab190a50 CR3: 00000001038ea006 CR4: 0000000000160ee0
> [  171.232586] Call Trace:
> [  171.233109]  <IRQ>
> [  171.233531] get_mapping_status (linux-mptcp/net/mptcp/subflow.c:691)
> [  171.234371] mptcp_subflow_data_available (linux-mptcp/net/mptcp/subflow.c:736 linux-mptcp/net/mptcp/subflow.c:832)
> [  171.238181] subflow_state_change (linux-mptcp/net/mptcp/subflow.c:1085 (discriminator 1))
> [  171.239066] tcp_fin (linux-mptcp/net/ipv4/tcp_input.c:4217)
> [  171.240123] tcp_data_queue (linux-mptcp/./include/linux/compiler.h:199 linux-mptcp/net/ipv4/tcp_input.c:4822)
> [  171.245083] tcp_rcv_established (linux-mptcp/./include/linux/skbuff.h:1785 linux-mptcp/./include/net/tcp.h:1774 linux-mptcp/./include/net/tcp.h:1847 linux-mptcp/net/ipv4/tcp_input.c:5238 linux-mptcp/net/ipv4/tcp_input.c:5730)
> [  171.254089] tcp_v4_rcv (linux-mptcp/./include/linux/spinlock.h:393 linux-mptcp/net/ipv4/tcp_ipv4.c:2009)
> [  171.258969] ip_protocol_deliver_rcu (linux-mptcp/net/ipv4/ip_input.c:204 (discriminator 1))
> [  171.260214] ip_local_deliver_finish (linux-mptcp/./include/linux/rcupdate.h:651 linux-mptcp/net/ipv4/ip_input.c:232)
> [  171.261389] ip_local_deliver (linux-mptcp/./include/linux/netfilter.h:307 linux-mptcp/./include/linux/netfilter.h:301 linux-mptcp/net/ipv4/ip_input.c:252)
> [  171.265884] ip_rcv (linux-mptcp/./include/linux/netfilter.h:307 linux-mptcp/./include/linux/netfilter.h:301 linux-mptcp/net/ipv4/ip_input.c:539)
> [  171.273666] process_backlog (linux-mptcp/./include/linux/rcupdate.h:651 linux-mptcp/net/core/dev.c:6135)
> [  171.275328] net_rx_action (linux-mptcp/net/core/dev.c:6572 linux-mptcp/net/core/dev.c:6640)
> [  171.280472] __do_softirq (linux-mptcp/./arch/x86/include/asm/jump_label.h:25 linux-mptcp/./include/linux/jump_label.h:200 linux-mptcp/./include/trace/events/irq.h:142 linux-mptcp/kernel/softirq.c:293)
> [  171.281379] do_softirq_own_stack (linux-mptcp/arch/x86/entry/entry_64.S:1083)
> [  171.282358]  </IRQ>
>
> We could address the issue clearing explicitly the relevant fields
> in several places - tcp_parse_option, tcp_fast_parse_options,
> possibly others.
>
> Instead we move the MPTCP option parsing into the already existing
> mptcp ingress hook, so that we need to clear the fields in a single
> place.
>
> This allows us dropping an MPTCP hook from the TCP code and
> removing the quite large mptcp_options_received from the tcp_sock
> struct. On the flip side, the MPTCP sockets will traverse the
> option space twice (in tcp_parse_option() and in
> mptcp_incoming_options(). That looks acceptable: we already
> do that for syn and 3rd ack packets, plain TCP socket will
> benefit from it, and even MPTCP sockets will experience better
> code locality, reducing the jumps between TCP and MPTCP code.
>
> Fixes: 648ef4b88673 ("mptcp: Implement MPTCP receive path")
> Signed-off-by: Paolo Abeni <pabeni@...hat.com>
> ---

Hi Paolo -

This doesn't apply cleanly to the net tree.


--
Mat Martineau
Intel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ