[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e45f3257-dc5c-3bcd-2de4-64f478ebb470@ovn.org>
Date: Thu, 11 May 2023 19:54:16 +0200
From: Ilya Maximets <i.maximets@....org>
To: Dumitru Ceara <dceara@...hat.com>, Eric Dumazet <edumazet@...gle.com>
Cc: i.maximets@....org, Antoine Tenart <atenart@...nel.org>,
davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
netdev@...r.kernel.org
Subject: Re: [PATCH net-next 4/4] net: skbuff: fix l4_hash comment
On 5/11/23 15:00, Dumitru Ceara wrote:
> On 5/11/23 14:33, Eric Dumazet wrote:
>> On Thu, May 11, 2023 at 2:10 PM Dumitru Ceara <dceara@...hat.com> wrote:
>>>
>>> Hi Antoine,
>>>
>>> On 5/11/23 11:34, Antoine Tenart wrote:
>>>> Since commit 877d1f6291f8 ("net: Set sk_txhash from a random number")
>>>> sk->sk_txhash is not a canonical 4-tuple hash. sk->sk_txhash is
>>>> used in the TCP Tx path to populate skb->hash, with skb->l4_hash=1.
>>>> With this, skb->l4_hash does not always indicate the hash is a
>>>> "canonical 4-tuple hash over transport ports" but rather a hash from L4
>>>> layer to provide a uniform distribution over flows. Reword the comment
>>>> accordingly, to avoid misunderstandings.
>>>
>>> But AFAIU the hash used to be a canonical 4-tuple hash and was used as
>>> such by other components, e.g., OvS:
>>>
>>> https://elixir.bootlin.com/linux/latest/source/net/openvswitch/actions.c#L1069
>>>
>>> It seems to me at least unfortunate that semantics change without
>>> considering other users. The fact that we now fix the documentation
>>> makes it seem like OvS was wrong to use the skb hash. However, before
>>> 877d1f6291f8 ("net: Set sk_txhash from a random number") it was OK for
>>> OvS to use the skb hash as a canonical 4-tuple hash.
>>>
>>
>> I do not think we can undo stuff that was done back in 2015
>>
>
> I understand. I guess I was kind of grasping at straws in the hope of
> getting a canonical 4-tuple hash.
>
>> Has anyone complained ?
>>
>
> It did go unnoticed for a while but recently we started getting
> (indirect) reports due to the hash changing.
>
> This one is from an upstream OVN (OvS) user:
> https://github.com/ovn-org/ovn/issues/112
>
> This is from an OpenShift (also running OVN/OvS) user:
> https://issues.redhat.com/browse/OCPBUGS-7406
>
>> Note that skb->hash has never been considered as canonical, for obvious reasons.
I guess, the other point here is that it's not an L4 hash either.
It's a random number. So, the documentation will still not be
correct even after the change proposed in this patch.
One other solution to the problem might be to stop setting l4_hash
flag while it's a random number.
One way to not break everything doing that will be to introduce a
new flag, e.g. 'rnd_hash' that will be a hash that is "not related
to packet fields, but provides a uniform distribution over flows".
skb_get_hash() then may return the current hash if it's any of
l4, rnd or sw. That should preserve the current logic across
the kernel code.
But having a new flag, we could introduce a new helper, for example
skb_get_stable_hash() or skb_get_hash_nonrandom() or something like
that, that will be equal to the current version of skb_get_hash(),
i.e. not take the random hash into account.
Affected subsystems (OVS, ECMP, SRv6) can be changed to use that
new function. This way these subsystems will get a software hash
based on the real packet fields, if it was originally random.
This will also preserve ability to use hash provided by the HW,
since it is not normally random.
With that, we'll also not need to have in the API something that has
'L4' in the name and in the docs, but has no relation to packet fields.
It can be argued that the description in the doc doesn't mean that
this hash is computed using L4 packet fields, but it's confusing
regardless and getting overlooked while creating new code, as it
shown by the issues in multiple substystems.
Hope this makes some sense.
Dumitru also had some alternative ideas on how to provide a stable
hash to subsystems that need it, but I'll leave it to him.
Best regards, Ilya Maximets.
>>
>>
>>> Best regards,
>>> Dumitru
>>>
>>>>
>>>> Signed-off-by: Antoine Tenart <atenart@...nel.org>
>>>> ---
>>>> include/linux/skbuff.h | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>>>> index 738776ab8838..f54c84193b23 100644
>>>> --- a/include/linux/skbuff.h
>>>> +++ b/include/linux/skbuff.h
>>>> @@ -791,8 +791,8 @@ typedef unsigned char *sk_buff_data_t;
>>>> * @active_extensions: active extensions (skb_ext_id types)
>>>> * @ndisc_nodetype: router type (from link layer)
>>>> * @ooo_okay: allow the mapping of a socket to a queue to be changed
>>>> - * @l4_hash: indicate hash is a canonical 4-tuple hash over transport
>>>> - * ports.
>>>> + * @l4_hash: indicate hash is from layer 4 and provides a uniform
>>>> + * distribution over flows.
>>>> * @sw_hash: indicates hash was computed in software stack
>>>> * @wifi_acked_valid: wifi_acked was set
>>>> * @wifi_acked: whether frame was acked on wifi or not
>>>
>>
>
Powered by blists - more mailing lists