Message-ID: <af01ddc5-82bc-4376-9874-e465bf5424f7@mails.ucas.ac.cn>
Date: Tue, 22 Apr 2025 18:48:44 +0800
From: Qiyu Yan <yanqiyu17@...ls.ucas.ac.cn>
To: Tariq Toukan <tariqt@...dia.com>, Saeed Mahameed <saeedm@...dia.com>,
 Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 Simon Horman <horms@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
 "David S. Miller" <davem@...emloft.net>
Cc: netdev@...r.kernel.org
Subject: DNAT'ed traffic from ConnectX-4 card triggers "hw csum failure" on
 veth interface

Hi all,

Apologies for the broad CC; I'm unsure which component is responsible
for the issue, but I've gathered more details since my last report.

After boot or after resetting the WARN_ONCE flag, I consistently observe 
the following in `dmesg`:

eth0: hw csum failure
skb len=52 headroom=98 headlen=52 tailroom=1578
mac=(64,14) mac_len=14 net=(78,20) trans=98
shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0x98009d14 start=40212 offset=38912 ip_summed=2 complete_sw=0 valid=0 level=0)
hash(0x2135374 sw=0 l4=1) proto=0x0800 pkttype=0 iif=2
priority=0x0 mark=0x0 alloc_cpu=20 vlan_all=0x0
encapsulation=0 inner(proto=0x0000, mac=0, net=0, trans=0)
dev name=eth0 feat=0x000061164fdd09e9
skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb headroom: 00000030: ba d7 32 44 dd 39 7e b7 bb bd 2e d5 88 e5 2d 00
skb headroom: 00000040: 9e 52 9b 58 46 89 aa 93 51 02 83 7e 08 00 45 00
skb headroom: 00000050: 00 48 d3 6d 00 00 3f 11 93 48 0a 00 00 7a 0a 58
skb headroom: 00000060: 00 1e
skb linear:   00000000: e2 e4 00 35 00 34 92 e9 f4 39 01 00 00 01 00 00
skb linear:   00000010: 00 00 00 00 06 72 65 70 6f 72 74 07 6d 65 65 74
skb linear:   00000020: 69 6e 67 07 74 65 6e 63 65 6e 74 03 63 6f 6d 00
skb linear:   00000030: 00 01 00 01
... large tailroom
CPU: 20 UID: 0 PID: 0 Comm: swapper/20 Tainted: G OE      6.14.2-300.fc42.x86_64 #1
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./EPYCD8, BIOS L2.52 11/25/2020
Call Trace:
  <IRQ>
  dump_stack_lvl+0x5d/0x80
  __skb_checksum_complete+0xeb/0x110
  ? __pfx_csum_partial_ext+0x10/0x10
  ? __pfx_csum_block_add_ext+0x10/0x10
  udp4_csum_init+0x1dc/0x2f0
  __udp4_lib_rcv+0xc8/0x750
  ? srso_return_thunk+0x5/0x5f
  ? raw_v4_input+0x14a/0x270
  ip_protocol_deliver_rcu+0xcb/0x1a0
  ip_local_deliver_finish+0x76/0xa0
  ip_local_deliver+0xfa/0x110
  __netif_receive_skb_one_core+0x87/0xa0
  process_backlog+0x87/0x130
  __napi_poll+0x31/0x1b0
  ? srso_return_thunk+0x5/0x5f
  net_rx_action+0x333/0x420
  handle_softirqs+0xf2/0x340
  ? srso_return_thunk+0x5/0x5f
  ? srso_return_thunk+0x5/0x5f
  __irq_exit_rcu+0xcb/0xf0
  common_interrupt+0x85/0xa0
  </IRQ>
  <TASK>
  asm_common_interrupt+0x26/0x40
RIP: 0010:cpuidle_enter_state+0xcc/0x660
Code: 00 00 e8 67 28 fb fe e8 d2 ed ff ff 49 89 c4 0f 1f 44 00 00 31 ff e8 73 61 f9 fe 45 84 ff 0f 85 02 02 00 00 fb 0f 1f 44 00 00 <85> ed 0f 88 d3 01 00 00 4c 63 f5 49 83 fe 0a 0f 83 9f 04 00 00 49
RSP: 0018:ffffa79d003afe50 EFLAGS: 00000246
RAX: ffff96440ca00000 RBX: ffff962542b89800 RCX: 0000000000000000
RDX: 000051a9557f7bf1 RSI: 000000003152c088 RDI: 0000000000000000
RBP: 0000000000000002 R08: ffffffee4d207359 R09: ffff96440ca315e0
R10: 000051bb10ea059b R11: 0000000000000000 R12: 000051a9557f7bf1
R13: ffffffffa7b15160 R14: 0000000000000002 R15: 0000000000000000

From inspecting the skb, the packet comes from a host (10.0.0.122) that
reaches our server through the server's ConnectX-4 Lx NIC. It is DNAT'ed
by iptables from 10.0.0.1:53 to a container at 10.88.0.30:53.
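
The addresses and ports can be read back out of the dump with a short
script. This is a minimal sketch in Python, assuming the last 20 bytes
of the "skb headroom" dump are the IPv4 header and the "skb linear" dump
is the complete UDP datagram. The helper name ones_complement_sum and
the variable names are illustrative, not kernel symbols.

```python
import ipaddress
import struct

# Last 20 bytes of the "skb headroom" dump: the IPv4 header.
ip_hdr = bytes.fromhex(
    "45 00 00 48 d3 6d 00 00 3f 11 93 48 0a 00 00 7a 0a 58 00 1e"
)
# The 52-byte "skb linear" dump: UDP header followed by the DNS query.
udp = bytes.fromhex(
    "e2 e4 00 35 00 34 92 e9 f4 39 01 00 00 01 00 00 "
    "00 00 00 00 06 72 65 70 6f 72 74 07 6d 65 65 74 "
    "69 6e 67 07 74 65 6e 63 65 6e 74 03 63 6f 6d 00 "
    "00 01 00 01"
)


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data (zero-padded to even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


ttl, proto = ip_hdr[8], ip_hdr[9]
src = ipaddress.IPv4Address(ip_hdr[12:16])
dst = ipaddress.IPv4Address(ip_hdr[16:20])
sport, dport, udp_len, udp_csum = struct.unpack("!4H", udp[:8])

# UDP pseudo-header: source, destination, zero byte, protocol, UDP length.
pseudo = ip_hdr[12:20] + struct.pack("!BBH", 0, proto, udp_len)

print(f"{src}:{sport} -> {dst}:{dport} ttl={ttl} proto={proto}")
# A sum of 0xffff means the checksum stored in the header verifies.
print(f"IP header sum:        {ones_complement_sum(ip_hdr):#06x}")
print(f"UDP pseudo+data sum:  {ones_complement_sum(pseudo + udp):#06x}")
```

With the bytes as transcribed here, both sums should fold to 0xffff,
i.e. the UDP checksum carried in the packet already matches the
rewritten destination 10.88.0.30. If so, the packet contents themselves
look intact, and the complaint would be about the checksum metadata
attached to the skb.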

Traffic path:

     10.0.0.122 --> [CX4 NIC 10.0.0.1/16]
                       |
               iptables DNAT (10.0.0.1:53 -> 10.88.0.30:53)
                       |
                 [linux bridge (podman0 10.88.0.1/16)]
                       |
                   [veth pair]
                       |
                 [eth0 inside container]

The warning is triggered when the packet arrives at eth0 inside the 
container.
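
For reference, a destination rewrite like the DNAT step can adjust the
UDP checksum incrementally instead of recomputing it. The sketch below
follows RFC 1624 (eqn. 3) and makes two assumptions that are not
confirmed anywhere in this report: that only the destination address
changed (10.0.0.1 -> 10.88.0.30, port 53 unchanged), and that the
sender's checksum was correct. All function names are illustrative.

```python
import struct

# UDP datagram as shown in the "skb linear" dump (after DNAT).
udp = bytes.fromhex(
    "e2 e4 00 35 00 34 92 e9 f4 39 01 00 00 01 00 00 "
    "00 00 00 00 06 72 65 70 6f 72 74 07 6d 65 65 74 "
    "69 6e 67 07 74 65 6e 63 65 6e 74 03 63 6f 6d 00 "
    "00 01 00 01"
)
src = bytes([10, 0, 0, 122])
old_dst = bytes([10, 0, 0, 1])     # address the client sent to
new_dst = bytes([10, 88, 0, 30])   # container address after DNAT


def fold(total: int) -> int:
    """Fold carries back into the low 16 bits (one's-complement add)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def words(data: bytes):
    """Split even-length data into big-endian 16-bit words."""
    return struct.unpack(f"!{len(data) // 2}H", data)


def udp_checksum(src: bytes, dst: bytes, datagram: bytes) -> int:
    """Full checksum over pseudo-header + datagram, checksum field zeroed."""
    zeroed = datagram[:6] + b"\x00\x00" + datagram[8:]
    pseudo = src + dst + struct.pack("!BBH", 0, 17, len(datagram))
    return (~fold(sum(words(pseudo + zeroed)))) & 0xFFFF


def csum_update(old_csum: int, old: bytes, new: bytes) -> int:
    """RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m')."""
    total = (~old_csum) & 0xFFFF
    total += sum((~w) & 0xFFFF for w in words(old))
    total += sum(words(new))
    return (~fold(total)) & 0xFFFF


# What the sender would have put on the wire (hypothetical, see above).
csum_before = udp_checksum(src, old_dst, udp)
# Incremental fix-up for the address rewrite.
csum_after = csum_update(csum_before, old_dst, new_dst)
dumped = struct.unpack("!H", udp[6:8])[0]

print(f"before DNAT: {csum_before:#06x}")
print(f"after DNAT:  {csum_after:#06x} (dump has {dumped:#06x})")
```

Under these assumptions the incremental result agrees with the checksum
field in the dump (0x92e9). The sketch ignores the UDP special case
where a computed checksum of 0 is transmitted as 0xffff.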

What's suspicious is the reported checksum info:

     csum(0x9800a314 start=41748 offset=38912 ip_summed=2 ...)

Here, start and offset are far beyond the size of the skb (len=52), which
makes the checksum metadata look invalid. I suspect that during DNAT
and/or forwarding through the bridge and veth, the checksum status is not
properly cleared or recalculated.
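
One possible reading of the odd start and offset values: in struct
sk_buff, csum shares storage (a union) with the csum_start/csum_offset
pair, and ip_summed=2 is CHECKSUM_COMPLETE, where csum is the member in
use. The Python sketch below assumes a little-endian host such as this
x86_64 machine and splits the two csum values quoted in this mail into
their 16-bit halves. The function name split_csum is illustrative.

```python
import struct

CHECKSUM_COMPLETE = 2  # ip_summed value shown in the dump


def split_csum(csum: int):
    """Reinterpret a 32-bit csum as (csum_start, csum_offset).

    Mirrors the union layout on a little-endian machine: csum_start
    overlays the low 16 bits, csum_offset the high 16 bits.
    """
    return struct.unpack("<HH", struct.pack("<I", csum))


# First value is from the full dump, second from the excerpt above.
for csum in (0x98009D14, 0x9800A314):
    start, offset = split_csum(csum)
    print(f"csum={csum:#010x} start={start} offset={offset}")

# csum=0x98009d14 start=40212 offset=38912
# csum=0x9800a314 start=41748 offset=38912
```

Both pairs match what the kernel printed, so under this reading start
and offset are just another view of the 32-bit checksum value rather
than independent offsets into the buffer. That would leave the stale or
mismatched csum value itself as the thing to investigate.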

The NIC is:
$ ethtool -i mlx-p1
driver: mlx5_core
version: 6.14.2-300.fc42.x86_64
firmware-version: 14.32.1900 (MT_2420110034)
expansion-rom-version:
bus-info: 0000:c1:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes


Best,
Qiyu

