Message-Id: <b7aa237d-e35e-4af7-a4c3-f8315c2f7310@bytedance.com>
Date: Thu, 8 Jan 2026 21:13:01 +0800
From: "Zigit Zo" <zuozhijie@...edance.com>
To: <netdev@...r.kernel.org>
Cc: <linux-kernel@...r.kernel.org>, <bpf@...r.kernel.org>
Subject: Question about RPS hash collisions with IPv6 flow labels

Hello netdev,

We have observed unexpected RPS behavior related to IPv6 flow labels on
our 5.10 and 5.15 kernels and would like to ask for advice. It occurs
under the following conditions:

a. virtio-net (no hash offload)
b. RPS enabled, so skb_get_hash() computes the hash in software
c. IPv6 with default auto_flowlabels enabled

Under these conditions RPS keeps selecting the same CPU for different
flows, and the hashes of those flows are very similar. This might be a
coincidence, but it happens persistently and affects around 10 RX
machines. Below is a sample from one RX machine:

xxxx:71b::50 -> yyyy, [flowlabel 0xeaf27] [skb->hash 3568038043] [cpu 79]
xxxx:71d::36 -> yyyy, [flowlabel 0xbf206] [skb->hash 3544518926] [cpu 79]
xxxx:71a::34 -> yyyy, [flowlabel 0x7b6a8] [skb->hash 3538231196] [cpu 79]
xxxx:71d::40 -> yyyy, [flowlabel 0xbd4a4] [skb->hash 3572956790] [cpu 79]
xxxx:71a::37 -> yyyy, [flowlabel 0x5dbe5] [skb->hash 3573425965] [cpu 79]
xxxx:71f::41 -> yyyy, [flowlabel 0x6acdf] [skb->hash 3571406812] [cpu 79]
xxxx:706::22 -> yyyy, [flowlabel 0x124ae] [skb->hash 3541372961] [cpu 79]
xxxx:718::28 -> yyyy, [flowlabel 0x5ca00] [skb->hash 3551598012] [cpu 79]
xxxx:708::29 -> yyyy, [flowlabel 0x1dfa9] [skb->hash 3559424332] [cpu 79]
xxxx:71c::40 -> yyyy, [flowlabel 0xfeb81] [skb->hash 3545152152] [cpu 79]
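
One detail that may explain why the hashes above look so similar: on 5.10
and 5.15, get_rps_cpu() picks the map entry with
reciprocal_scale(hash, map->len), i.e. (hash * len) >> 32. The entry is
therefore determined by the high bits of the hash, and any set of flows
steered to one CPU necessarily has hashes inside one contiguous range. The
Python sketch below replays that arithmetic on the ten hashes listed above.
The map length of 96 is an assumption chosen for illustration only (the
actual rps_cpus mask of this machine is not given), and the helper names are
hypothetical.

```python
# Sketch of the index computation in get_rps_cpu():
#   cpu = map->cpus[reciprocal_scale(hash, map->len)]
# RPS_MAP_LEN is a hypothetical value, NOT taken from the report.

RPS_MAP_LEN = 96  # assumption: number of CPUs set in rps_cpus

# skb->hash values copied from the sample above.
HASHES = [
    3568038043, 3544518926, 3538231196, 3572956790, 3573425965,
    3571406812, 3541372961, 3551598012, 3559424332, 3545152152,
]


def reciprocal_scale(val, ep_ro):
    """Map a 32-bit value into [0, ep_ro), like the kernel helper."""
    return ((val & 0xFFFFFFFF) * ep_ro) >> 32


def index_range(idx, n):
    """Return the half-open hash range [lo, hi) that maps to entry idx."""
    lo = -(-(idx << 32) // n)        # ceil(idx * 2^32 / n)
    hi = -(-((idx + 1) << 32) // n)  # ceil((idx + 1) * 2^32 / n)
    return lo, hi


if __name__ == "__main__":
    for h in HASHES:
        print(f"hash {h:>10} -> map index {reciprocal_scale(h, RPS_MAP_LEN)}")
    lo, hi = index_range(79, RPS_MAP_LEN)
    print(f"hashes in [{lo}, {hi}) map to index 79")
```

Under the 96-entry assumption all ten hashes fall on map index 79. That
index equals CPU 79 only if the mask covers CPUs 0-95 contiguously, which is
also an assumption. The similarity of the hashes is thus a consequence of
landing on one CPU, not a separate anomaly; the open question is why so many
flows get hashes in that range.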

Most of the connections are long-lived, but even when the flow label is
changed on retransmission, RPS still keeps selecting the same CPU. We are
wondering why this happens. One possibility is that the TX side is running
a rather old kernel which still uses prandom to generate sk_txhash (flow
label), leading to a higher chance of hash collisions. However, we are not
sure about this, so we would like to ask for help:

- Does anyone know how to explain these hash collisions if they are
  generated by prandom? Is this very likely to occur, or is it really a
  corner case that we hit?

- Linux offers little ability to ignore or override the flow label in RPS,
  which one may want to do for performance or security reasons. Are there
  any ideas or plans to improve this?

- The flow dissector BPF attach point is somewhat hard to use, especially
  for IPv6 with extension headers. We only want to remove the flow label
  from the keys, not to recompute the remaining keys, which we are not
  interested in. A BPF flow dissector also affects many other places (we
  are using the host network without network namespaces), such as the
  FIB, which we do not want to touch. A tc BPF program can modify packets
  to clear the IPv6 flow label, but this still has a wide impact.
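
On the first question, a rough baseline may help frame the discussion: if
skb->hash were uniformly distributed and independent across flows, the chance
that k flows all land on one RPS map entry out of n would be about
n^-(k-1). The sketch below evaluates this for the ten flows listed earlier.
Here n = 96 is a hypothetical map length, not a value from this report, and
uniformity is exactly the assumption in question.

```python
from fractions import Fraction


def p_all_same_entry(k, n):
    """Probability that k independent, uniform hashes share one of n entries.

    The first flow may land anywhere; each of the remaining k - 1 flows
    must hit the same entry, each with probability 1/n.
    """
    if k < 1 or n < 1:
        raise ValueError("k and n must be positive")
    return Fraction(1, n) ** (k - 1)


if __name__ == "__main__":
    # k = 10 flows from the sample; n = 96 is an assumed RPS map length.
    p = p_all_same_entry(10, 96)
    print(f"P(all 10 flows on one entry) ~ {float(p):.3e}")
```

With these assumed numbers the probability is on the order of 1e-18, so a
pure coincidence under uniform hashing looks implausible, which is why we
suspect a bias in how the hash or the flow label is generated.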

-- 
Regards,
