Message-Id: <b7aa237d-e35e-4af7-a4c3-f8315c2f7310@bytedance.com>
Date: Thu, 8 Jan 2026 21:13:01 +0800
From: "Zigit Zo" <zuozhijie@...edance.com>
To: <netdev@...r.kernel.org>
Cc: <linux-kernel@...r.kernel.org>, <bpf@...r.kernel.org>
Subject: Question about RPS hash collisions with IPv6 flow labels
Hello netdev,

We have observed unexpected RPS behavior related to IPv6 flow labels on our
5.10 and 5.15 kernels and would like to ask for advice. It shows up under
the following conditions:
a. virtio-net (no hash offload)
b. RPS enabled, so skb_get_hash() computes the hash in software
c. IPv6 with auto_flowlabels left at its default (enabled)
Under these conditions, RPS keeps selecting the same CPU for different
flows, and their hashes are very similar. This might be a coincidence, but
it happens consistently and affects around 10 RX machines. Here is a sample
from one of them:
xxxx:71b::50 -> yyyy, [flowlabel 0xeaf27] [skb->hash 3568038043] [cpu 79]
xxxx:71d::36 -> yyyy, [flowlabel 0xbf206] [skb->hash 3544518926] [cpu 79]
xxxx:71a::34 -> yyyy, [flowlabel 0x7b6a8] [skb->hash 3538231196] [cpu 79]
xxxx:71d::40 -> yyyy, [flowlabel 0xbd4a4] [skb->hash 3572956790] [cpu 79]
xxxx:71a::37 -> yyyy, [flowlabel 0x5dbe5] [skb->hash 3573425965] [cpu 79]
xxxx:71f::41 -> yyyy, [flowlabel 0x6acdf] [skb->hash 3571406812] [cpu 79]
xxxx:706::22 -> yyyy, [flowlabel 0x124ae] [skb->hash 3541372961] [cpu 79]
xxxx:718::28 -> yyyy, [flowlabel 0x5ca00] [skb->hash 3551598012] [cpu 79]
xxxx:708::29 -> yyyy, [flowlabel 0x1dfa9] [skb->hash 3559424332] [cpu 79]
xxxx:71c::40 -> yyyy, [flowlabel 0xfeb81] [skb->hash 3545152152] [cpu 79]
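
For reference, a minimal sketch of how RPS maps skb->hash to an entry in
the rps_cpus map, mirroring the reciprocal_scale() helper that
get_rps_cpu() uses. The map length of 96 and the idea that entry i is CPU i
are assumptions made for illustration, not values read from this machine:

```python
# Sketch of RPS CPU selection in plain Python; this is not kernel code.
# ASSUMPTION: the rps_cpus map has 96 entries and entry i is CPU i.
RPS_MAP_LEN = 96


def reciprocal_scale(val, ep_ro):
    """Scale a 32-bit value into [0, ep_ro), like the kernel helper."""
    return ((val & 0xFFFFFFFF) * ep_ro) >> 32


def hash_range_for_index(index, map_len):
    """Return the inclusive range of 32-bit hashes that scale to `index`."""
    lo = -(-(index << 32) // map_len)            # ceil(index * 2^32 / len)
    hi = -(-((index + 1) << 32) // map_len) - 1  # last hash below next entry
    return lo, hi


# skb->hash values copied from the sample above
hashes = [
    3568038043, 3544518926, 3538231196, 3572956790, 3573425965,
    3571406812, 3541372961, 3551598012, 3559424332, 3545152152,
]

indices = [reciprocal_scale(h, RPS_MAP_LEN) for h in hashes]
print(indices)
print(hash_range_for_index(79, RPS_MAP_LEN))
```

Under that assumed map length, entry 79 covers hashes 3534400171 through
3579139413, and all ten sampled hashes fall inside that band. Because the
upper bits of the hash decide the entry, flows steered to the same CPU
would necessarily have close hash values if the real map is similar.
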
Most of the connections are long-lived, but even when the flow label
changes on retransmission, RPS still keeps selecting the same CPU. We are
wondering why this happens. One possibility is that the TX side is running
a rather old kernel that still uses prandom to generate sk_txhash (from
which the flow label is derived), leading to a higher chance of hash
collisions. However, we are not sure about this, so we would like to ask
for help:
- Does anyone know how to explain these hash collisions if they are
generated by prandom? Is this very likely to occur, or is it really a
corner case that we hit?
- Linux has limited ability to ignore or override the flow label in RPS
(for performance or security reasons). Are there any ideas or plans to
improve this?
- The flow dissector BPF attach point is somewhat hard to use, especially
for IPv6 with extension headers. We only want to remove the flow label
from the keys, not recompute the remaining keys, which we are not
interested in. A flow dissector program also affects many other places
(we use the host network without network namespaces), such as the FIB,
which we do not want to touch. A tc BPF program can modify packets to
clear the IPv6 flow label, but this still has a wide impact.
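
To make the last point concrete, here is a small sketch of what clearing
the flow label means at the byte level. The first 32-bit word of the IPv6
header holds the version (4 bits), traffic class (8 bits) and flow label
(20 bits). The function names and the sample header are made up for
illustration; this is plain Python, not the tc BPF program itself:

```python
import struct

FLOWLABEL_MASK = 0x000FFFFF  # low 20 bits of the first IPv6 header word
IPV6_HEADER_LEN = 40         # fixed IPv6 header size in bytes


def flow_label(ipv6_header):
    """Extract the 20-bit flow label from a raw IPv6 header."""
    (word,) = struct.unpack_from("!I", ipv6_header, 0)
    return word & FLOWLABEL_MASK


def clear_flow_label(ipv6_header):
    """Return a copy of the header with the flow label set to zero."""
    if len(ipv6_header) < IPV6_HEADER_LEN:
        raise ValueError("truncated IPv6 header")
    (word,) = struct.unpack_from("!I", ipv6_header, 0)
    if word >> 28 != 6:
        raise ValueError("not an IPv6 header")
    # Keep version and traffic class, zero the 20 flow label bits.
    word &= ~FLOWLABEL_MASK & 0xFFFFFFFF
    return struct.pack("!I", word) + ipv6_header[4:]


# Hypothetical header: version 6, traffic class 0, flow label 0xeaf27,
# with the remaining 36 bytes zeroed for brevity.
hdr = struct.pack("!I", (6 << 28) | 0xEAF27) + bytes(36)
cleared = clear_flow_label(hdr)
print(hex(flow_label(hdr)), hex(flow_label(cleared)))
```

The IPv6 header has no checksum of its own and the flow label is not part
of the TCP/UDP pseudo-header, so no checksum update is needed. The rewrite
is still visible to everything that sees the packet afterwards, which is
the wide impact mentioned above.
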
--
Regards,