Message-ID: <20250617144017.82931-1-maxim@isovalent.com>
Date: Tue, 17 Jun 2025 16:39:59 +0200
From: Maxim Mikityanskiy <maxtram95@...il.com>
To: Daniel Borkmann <daniel@...earbox.net>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>,
	Paolo Abeni <pabeni@...hat.com>,
	Willem de Bruijn <willemdebruijn.kernel@...il.com>,
	David Ahern <dsahern@...nel.org>,
	Nikolay Aleksandrov <razor@...ckwall.org>
Cc: netdev@...r.kernel.org,
	Maxim Mikityanskiy <maxim@...valent.com>
Subject: [PATCH RFC net-next 00/17] BIG TCP for UDP tunnels

This series consists of two parts that will be submitted separately:

01-11: Remove hop-by-hop header for BIG TCP IPv6.
12-17: Fix up things that prevent BIG TCP from working with tunnels.

I kept both parts here for the sake of the big picture.

There are a few places that make assumptions about skb->len being
smaller than 64k and/or that store it in 16-bit fields, trimming the
length. The first step to enable BIG TCP with VXLAN and GENEVE tunnels
is to patch those places to handle bigger lengths properly (patches
12-17). This is enough to make IPv4 in IPv4 work with BIG TCP, but when
either the outer or the inner protocol is IPv6, the current BIG TCP code
inserts a hop-by-hop extension header that stores the actual 32-bit
length of the packet.
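
To illustrate the truncation problem, here is a minimal userspace sketch. It
is not code from the series, and both helper names are hypothetical. It shows
what happens when a length above 64k is stored in a 16-bit field without a
check, which is the pattern patches 12-17 have to find and fix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a packet length in a 16-bit header field the
 * way a legacy code path might. Lengths above 65535 wrap around silently.
 */
static uint16_t store_len16(size_t len)
{
	return (uint16_t)len;
}

/* Hypothetical helper: report whether a length fits in a 16-bit field, so
 * that the caller can take a separate path for big packets.
 */
static int len_fits_u16(size_t len)
{
	return len <= UINT16_MAX;
}
```

With a 70000-byte aggregated packet, store_len16() keeps only the low 16 bits
of the length, so a receiver trusting that field would see a much shorter
packet.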

This additional hop-by-hop header turns out to be problematic for encapsulated
cases, because:

1. The drivers don't strip it, and they'd all need to know the structure
of each tunnel protocol in order to strip it correctly.

2. Even if (1) is implemented, it would be an additional performance
penalty per aggregated packet.

3. The skb_gso_validate_network_len check is skipped in
ip6_finish_output_gso when IP6SKB_FAKEJUMBO is set, but it seems that it
would make sense to do the actual validation, just taking into account
the length of the HBH header. When the support for tunnels is added, it
becomes trickier, because there may be one or two HBH headers, depending
on whether it's IPv6 in IPv6 or not.
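
For reference, the jumbo HBH header in question is 8 bytes long (RFC 2675).
The sketch below is a userspace model, not kernel code, and the function names
are hypothetical. It builds the header and computes the extra bytes that a
length check would have to account for, under the assumption that every IPv6
layer of the packet carries exactly one such header.

```c
#include <stddef.h>
#include <stdint.h>

#define HBH_JUMBO_LEN	8u	/* one hop-by-hop header with a jumbo option */
#define TLV_JUMBO_TYPE	0xC2	/* Jumbo Payload option type, RFC 2675 */

/* Hypothetical helper: fill in an 8-byte hop-by-hop header that carries the
 * 32-bit jumbo payload length in network byte order.
 */
static void build_hbh_jumbo(uint8_t out[HBH_JUMBO_LEN], uint8_t nexthdr,
			    uint32_t jumbo_len)
{
	out[0] = nexthdr;		/* protocol following this header */
	out[1] = 0;			/* ext header length: (8 / 8) - 1 */
	out[2] = TLV_JUMBO_TYPE;
	out[3] = 4;			/* option data length in bytes */
	out[4] = (uint8_t)(jumbo_len >> 24);
	out[5] = (uint8_t)(jumbo_len >> 16);
	out[6] = (uint8_t)(jumbo_len >> 8);
	out[7] = (uint8_t)jumbo_len;
}

/* Hypothetical helper: extra bytes contributed by HBH headers, assuming one
 * per IPv6 layer (outer and/or inner).
 */
static size_t hbh_overhead(int outer_is_ipv6, int inner_is_ipv6)
{
	return HBH_JUMBO_LEN * (size_t)(!!outer_is_ipv6 + !!inner_is_ipv6);
}
```

The overhead is 0, 8 or 16 bytes depending on the combination of outer and
inner protocols, which is what makes the validation trickier for tunnels.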

At the same time, having an HBH header to store the 32-bit length is not
strictly necessary, as BIG TCP IPv4 doesn't do anything like this and
just restores the length from skb->len. The same thing can be done for
BIG TCP IPv6 (patches 01-11).
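
A rough userspace model of the idea follows. It is not the helper introduced
by the series, the names are hypothetical, and it assumes that a zero 16-bit
payload length field marks a big packet whose real length must be derived from
the total packet length (the role skb->len plays in the kernel).

```c
#include <stddef.h>
#include <stdint.h>

#define IPV6_FIXED_HDR_LEN	40u	/* size of the fixed IPv6 header */

/* Hypothetical helper: value to put in the 16-bit payload length field.
 * Payloads that do not fit are signalled with 0 (an assumption here).
 */
static uint16_t ipv6_wire_payload_len(size_t payload_len)
{
	return payload_len <= UINT16_MAX ? (uint16_t)payload_len : 0;
}

/* Hypothetical helper: recover the real payload length.
 * wire_len: 16-bit field from the header, in host byte order.
 * pkt_len:  bytes from the start of the IPv6 header to the end of the packet.
 */
static uint32_t ipv6_effective_payload_len(uint16_t wire_len, size_t pkt_len)
{
	if (wire_len != 0)
		return wire_len;
	if (pkt_len < IPV6_FIXED_HDR_LEN)
		return 0;	/* malformed: shorter than the fixed header */
	return (uint32_t)(pkt_len - IPV6_FIXED_HDR_LEN);
}
```

No extension header is involved in this scheme: the 32-bit length lives only
in the packet metadata, as it does for BIG TCP IPv4.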

The only reason we keep inserting HBH seems to be for the benefit of
tools that parse the packets, but the above drawbacks seem to outweigh
this, and the tools can be patched (as they already need to be in order
to parse BIG TCP IPv4). I have a patch for tcpdump.

Removing HBH from BIG TCP would allow simplifying the implementation
significantly and aligning it with BIG TCP IPv4.

Daniel Borkmann (1):
  geneve: Enable BIG TCP packets

Maxim Mikityanskiy (16):
  net/ipv6: Introduce payload_len helpers
  net/ipv6: Drop HBH for BIG TCP on TX side
  net/ipv6: Drop HBH for BIG TCP on RX side
  net/ipv6: Remove jumbo_remove step from TX path
  net/mlx5e: Remove jumbo_remove step from TX path
  net/mlx4: Remove jumbo_remove step from TX path
  ice: Remove jumbo_remove step from TX path
  bnxt_en: Remove jumbo_remove step from TX path
  gve: Remove jumbo_remove step from TX path
  net: mana: Remove jumbo_remove step from TX path
  net/ipv6: Remove HBH helpers
  net: Enable BIG TCP with partial GSO
  udp: Support gro_ipv4_max_size > 65536
  udp: Validate UDP length in udp_gro_receive
  udp: Set length in UDP header to 0 for big GSO packets
  vxlan: Enable BIG TCP packets

 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 21 -----
 drivers/net/ethernet/google/gve/gve_tx_dqo.c  |  3 -
 drivers/net/ethernet/intel/ice/ice_txrx.c     |  3 -
 drivers/net/ethernet/mellanox/mlx4/en_tx.c    | 42 ++--------
 .../net/ethernet/mellanox/mlx5/core/en_tx.c   | 75 +++---------------
 drivers/net/ethernet/microsoft/mana/mana_en.c |  3 -
 drivers/net/geneve.c                          |  2 +
 drivers/net/vxlan/vxlan_core.c                |  2 +
 include/linux/ipv6.h                          | 21 ++++-
 include/net/ipv6.h                            | 79 -------------------
 include/net/netfilter/nf_tables_ipv6.h        |  4 +-
 net/bridge/br_netfilter_ipv6.c                |  2 +-
 net/bridge/netfilter/nf_conntrack_bridge.c    |  4 +-
 net/core/dev.c                                |  3 +-
 net/core/gro.c                                |  2 -
 net/core/skbuff.c                             | 10 +--
 net/ipv4/udp.c                                |  5 +-
 net/ipv4/udp_offload.c                        | 12 ++-
 net/ipv4/udp_tunnel_core.c                    |  2 +-
 net/ipv6/ip6_input.c                          |  2 +-
 net/ipv6/ip6_offload.c                        | 36 +--------
 net/ipv6/ip6_output.c                         | 20 +----
 net/ipv6/ip6_udp_tunnel.c                     |  2 +-
 net/ipv6/output_core.c                        |  7 +-
 net/netfilter/ipvs/ip_vs_xmit.c               |  2 +-
 net/netfilter/nf_conntrack_ovs.c              |  2 +-
 net/netfilter/nf_log_syslog.c                 |  2 +-
 net/sched/sch_cake.c                          |  2 +-
 28 files changed, 83 insertions(+), 287 deletions(-)

-- 
2.49.0

