lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <36566e86-57df-40f3-80ff-7833f930311d@gmail.com>
Date: Sun, 6 Apr 2025 22:35:26 +0800
From: Qiyu Yan <yanqiyu01@...il.com>
To: netdev@...r.kernel.org
Cc: "David S. Miller" <davem@...emloft.net>,
 Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
 netdev@...r.kernel.org
Subject: [bug?] "hw csum failure" warning triggered on veth interface

Dear linux network maintainers,

I'm encountering consistent |hw csum failure| warnings during system 
boot. Here's an example from a recent log (running stock kernel 
6.14.0-63.fc42.x86_64 from Fedora 42 pre-release):

[   74.128126] (NULL net_device): hw csum failure
[   74.128149] skb len=545 headroom=98 headlen=545 tailroom=61
                mac=(64,14) mac_len=14 net=(78,20) trans=98
                shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
                csum(0x9edfcad start=64685 offset=2541 ip_summed=2 
complete_sw=0 valid=0 level=0)
                hash(0x5c58e98 sw=0 l4=1) proto=0x0800 pkttype=0 iif=3
                priority=0x0 mark=0x0 alloc_cpu=26 vlan_all=0x0
                encapsulation=0 inner(proto=0x0000, mac=0, net=0, trans=0)
[   74.128178] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00
[   74.128188] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00
[   74.128197] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00
[   74.128205] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00
[   74.128214] skb headroom: 00000040: 72 30 8d ae 4f 32 e2 a4 be b5 59 
db 08 00 45 00
[   74.128222] skb headroom: 00000050: 02 35 d2 65 40 00 33 06 da 7e a3 
7d eb 05 0a 58
[   74.128230] skb headroom: 00000060: 00 04
[   74.128239] skb linear:   00000000: e5 80 72 46 8c 57 20 0f af 05 eb 
53 50 18 04 04
[   74.128247] skb linear:   00000010: c3 91 00 00 4b 75 31 58 8e c6 71 
48 84 68 65 07
[   74.128255] skb linear:   00000020: fe a6 6f e7 cd 8c 64 a0 4e f6 2b 
f3 eb 61 d7 68
[   74.128263] skb linear:   00000030: 8e a9 0f b6 67 66 be 92 c1 11 f9 
72 58 38 21 1e
[   74.128271] skb linear:   00000040: c3 93 b6 3d 73 ec 70 46 a6 cf 56 
e6 c2 eb 02 26
[   74.128280] skb linear:   00000050: 1e 61 9c 28 70 15 b3 d3 8f ba e4 
b0 7f b7 3a 43
[   74.128288] skb linear:   00000060: 5f 18 6e d2 1c 1a 6d 31 f1 02 70 
01 3e b8 b8 da
[   74.128296] skb linear:   00000070: ed 17 c8 be 1c ae 94 c0 90 54 e2 
5d 6b f0 c4 d1
[   74.128303] skb linear:   00000080: 02 96 d1 e8 3e 9a df b3 42 a3 c6 
36 4d 01 67 61
[   74.128311] skb linear:   00000090: e2 41 ed 42 27 fe 53 78 8c fa 27 
eb ac 6d 8d ba
[   74.128319] skb linear:   000000a0: 78 9c 86 75 92 ae 72 8d f7 bb d4 
08 e1 27 56 79
[   74.128327] skb linear:   000000b0: ec 2e 0d 30 77 bf fd ae 4d 8e e0 
5c 85 65 23 7c
[   74.128334] skb linear:   000000c0: a6 ba 32 5f 0f 87 f5 d8 96 56 9a 
f2 70 9b 96 de
[   74.128342] skb linear:   000000d0: 51 47 e6 2f d3 9a 9b 4a 1c 39 95 
17 bb 80 8f fd
[   74.128349] skb linear:   000000e0: d4 19 5c 0e 7d ce 6f 7e 67 9b a1 
5a c1 08 2f 76
[   74.128357] skb linear:   000000f0: 59 b6 02 a8 05 37 34 33 41 22 cf 
86 19 67 d8 27
[   74.128364] skb linear:   00000100: 4a e1 8c ea a4 2a e9 66 b2 b3 70 
a9 9d 14 2a 2b
[   74.128373] skb linear:   00000110: 4e a0 e9 01 d3 3d d0 53 04 73 15 
10 66 c2 06 e0
[   74.128380] skb linear:   00000120: 4f 39 4a 5b 4b 44 6a 78 bf c6 90 
48 cc 67 8e e4
[   74.128388] skb linear:   00000130: 76 30 21 a4 06 55 77 91 ac 51 f0 
1d 69 38 22 12
[   74.128396] skb linear:   00000140: 2c 49 1f c9 3c c3 fa 9c d5 fb 87 
9d 16 aa 63 89
[   74.128403] skb linear:   00000150: 1b 8b 34 f7 66 26 32 d5 83 e6 e7 
15 eb 72 32 a4
[   74.128411] skb linear:   00000160: 2a 3a 92 9c 3d 50 a1 ba 3e 7a df 
12 43 85 b1 01
[   74.128418] skb linear:   00000170: 83 dc aa 64 ba 59 08 07 cf 5a 82 
61 b4 18 41 7e
[   74.128426] skb linear:   00000180: 8f 34 2c 3c 17 93 68 ba 40 6c 1f 
0e 1a 9f 81 36
[   74.128434] skb linear:   00000190: f6 49 09 51 cc 95 02 10 d9 d5 49 
67 8c d1 54 88
[   74.128442] skb linear:   000001a0: a3 5e 73 11 92 33 56 84 24 f9 d0 
f9 64 a1 da 0f
[   74.128449] skb linear:   000001b0: be fa db 28 62 83 27 d6 e9 7e c5 
90 3b 45 75 aa
[   74.128457] skb linear:   000001c0: b0 e1 f1 84 75 d9 74 01 32 48 79 
3a e9 32 c5 74
[   74.128465] skb linear:   000001d0: 22 18 a7 50 45 ca 7f 42 47 7d 7d 
44 88 1d ab cc
[   74.128472] skb linear:   000001e0: fc e5 2e fb 8a 2c c9 17 b1 82 a2 
3b 71 fb 49 4d
[   74.128480] skb linear:   000001f0: 69 cb f6 31 3d 13 12 3c 3a fb f9 
ec 3d 01 ff d6
[   74.128488] skb linear:   00000200: d0 91 b1 df 97 d5 5d af eb ce d4 
63 c4 a4 6e 82
[   74.128496] skb linear:   00000210: dc 3a 4f 33 11 06 e9 ad 0b 20 c2 
ee 20 98 77 b0
[   74.128504] skb linear:   00000220: 74
[   74.128511] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00
[   74.128519] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00
[   74.128527] skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00
[   74.128534] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 
00 00
[   74.128545] CPU: 26 UID: 0 PID: 0 Comm: swapper/26 Tainted: G         
  OE     -------  ---  6.14.0-63.fc42.x86_64 #1
[   74.128554] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[   74.128557] Hardware name: To Be Filled By O.E.M. To Be Filled By 
O.E.M./EPYCD8, BIOS L2.52 11/25/2020
[   74.128562] Call Trace:
[   74.128567]  <IRQ>
[   74.128579]  dump_stack_lvl+0x5d/0x80
[   74.128594]  __skb_checksum_complete+0xe8/0x100
[   74.128605]  ? __pfx_csum_partial_ext+0x10/0x10
[   74.128611]  ? __pfx_csum_block_add_ext+0x10/0x10
[   74.128620]  tcp_rcv_established+0x4da/0x770
[   74.128634]  tcp_v4_do_rcv+0x165/0x2b0
[   74.128643]  tcp_v4_rcv+0xc72/0xf40
[   74.128655]  ip_protocol_deliver_rcu+0x33/0x190
[   74.128664]  ip_local_deliver_finish+0x76/0xa0
[   74.128671]  ip_local_deliver+0xf6/0x100
[   74.128682]  __netif_receive_skb_one_core+0x87/0xa0
[   74.128693]  process_backlog+0x87/0x130
[   74.128703]  __napi_poll+0x2b/0x160
[   74.128713]  net_rx_action+0x333/0x420
[   74.128737]  handle_softirqs+0xf2/0x340
[   74.128747]  ? srso_return_thunk+0x5/0x5f
[   74.128760]  __irq_exit_rcu+0xc2/0xe0
[   74.128768]  common_interrupt+0x85/0xa0
[   74.128777]  </IRQ>
[   74.128779]  <TASK>
[   74.128783]  asm_common_interrupt+0x26/0x40
[   74.128792] RIP: 0010:cpuidle_enter_state+0xcc/0x660
[   74.128799] Code: 00 00 e8 d7 23 00 ff e8 62 ee ff ff 49 89 c4 0f 1f 
44 00 00 31 ff e8 03 6c fe fe 45 84 ff 0f 85 02 02 00 00 fb 0f 1f 44 00 
00 <85> ed 0f 88 d3 01 00 00 4c 63 f5 49 83 fe 0a 0f 83 9f 04 00 00 49
[   74.128803] RSP: 0018:ffffb8bd003dfe58 EFLAGS: 00000246
[   74.128809] RAX: ffff9ecb4cd00000 RBX: ffff9eac82e5a800 RCX: 
0000000000000000
[   74.128813] RDX: 00000011426107f1 RSI: 000000003152c088 RDI: 
0000000000000000
[   74.128817] RBP: 0000000000000002 R08: 00000000000d5a5c R09: 
0000000000000001
[   74.128820] R10: 0000000000000003 R11: ffff9ecb4cd217c0 R12: 
00000011426107f1
[   74.128823] R13: ffffffffb8b15140 R14: 0000000000000002 R15: 
0000000000000000
[   74.128841]  ? cpuidle_enter_state+0xbd/0x660
[   74.128853]  cpuidle_enter+0x2d/0x40
[   74.128864]  cpuidle_idle_call+0xf2/0x160
[   74.128875]  do_idle+0x78/0xd0
[   74.128883]  cpu_startup_entry+0x29/0x30
[   74.128890]  start_secondary+0x12d/0x160
[   74.128901]  common_startup_64+0x13e/0x141
[   74.128918]  </TASK>

What caught my attention is that iif=3 points to an interface that is 
not connected to the outside and, as far as I can tell, should not be a 
source of any errors.

Through testing, I've observed the following:

 1. Disabling all Podman containers eliminates the warning.
 2. Disabling only containers using macvlan/ipvlan (while leaving others
    running) still triggers the warning.
 3. Booting with a limited number of containers also reproduces the
    warning — the example above was captured in such a scenario.

The skb dump includes this line:

skb headroom: 00000040: 72 30 8d ae 4f 32 e2 a4 be b5 59 db 08 00 45 00

This appears to show the MAC address of the skb, which I was able to 
trace to:
$ sudo podman exec -it systemd-qbittorrentEH ip a
[... unrelated ...]
3: eth0@...2: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc 
noqueue state UP qlen 1000
     link/ether 72:30:8d:ae:4f:32 brd ff:ff:ff:ff:ff:ff
     inet 10.88.0.4/16 brd 10.88.255.255 scope global eth0
        valid_lft forever preferred_lft forever
     inet6 fccc::4/64 scope global
        valid_lft forever preferred_lft forever
     inet6 fe80::7030:8dff:feae:4f32/64 scope link
        valid_lft forever preferred_lft forever
And the other MAC:
$ ip link show podman0
9: podman0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue 
state UP mode DEFAULT group default qlen 1000
     link/ether e2:a4:be:b5:59:db brd ff:ff:ff:ff:ff:ff

This seems to suggest the warning involves traffic between a veth pair 
used by containers, raising the possibility of a bug in the kernel.

For completeness, here is NIC information from the system (2x ConnectX-4 
MCX4121A-ACAT):
$ ethtool -i mlx-p0
driver: mlx5_core
version: 6.14.0-63.fc42.x86_64
firmware-version: 14.32.1900 (MT_2420110034)
expansion-rom-version:
bus-info: 0000:c1:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

(and 2 unplugged i350 ports)
$ ethtool -i board-p0
driver: igb
version: 6.14.0-63.fc42.x86_64
firmware-version: 1.69, 0x80000df4
expansion-rom-version:
bus-info: 0000:45:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

Please let me know if further debugging or logs would be helpful. I'd be 
happy to provide more detail or try any suggested patches.

Best,
Qiyu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ