lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 11 May 2023 22:50:32 +0200
From: Dumitru Ceara <dceara@...hat.com>
To: Ilya Maximets <i.maximets@....org>, Eric Dumazet <edumazet@...gle.com>
Cc: Antoine Tenart <atenart@...nel.org>, davem@...emloft.net,
 kuba@...nel.org, pabeni@...hat.com, netdev@...r.kernel.org
Subject: Re: [PATCH net-next 4/4] net: skbuff: fix l4_hash comment

On 5/11/23 19:54, Ilya Maximets wrote:
> On 5/11/23 15:00, Dumitru Ceara wrote:
>> On 5/11/23 14:33, Eric Dumazet wrote:
>>> On Thu, May 11, 2023 at 2:10 PM Dumitru Ceara <dceara@...hat.com> wrote:
>>>>
>>>> Hi Antoine,
>>>>
>>>> On 5/11/23 11:34, Antoine Tenart wrote:
>>>>> Since commit 877d1f6291f8 ("net: Set sk_txhash from a random number")
>>>>> sk->sk_txhash is not a canonical 4-tuple hash. sk->sk_txhash is
>>>>> used in the TCP Tx path to populate skb->hash, with skb->l4_hash=1.
>>>>> With this, skb->l4_hash does not always indicate the hash is a
>>>>> "canonical 4-tuple hash over transport ports" but rather a hash from L4
>>>>> layer to provide a uniform distribution over flows. Reword the comment
>>>>> accordingly, to avoid misunderstandings.
>>>>
>>>> But AFAIU the hash used to be a canonical 4-tuple hash and was used as
>>>> such by other components, e.g., OvS:
>>>>
>>>> https://elixir.bootlin.com/linux/latest/source/net/openvswitch/actions.c#L1069
>>>>
>>>> It seems to me at least unfortunate that semantics change without
>>>> considering other users.  The fact that we now fix the documentation
>>>> makes it seem like OvS was wrong to use the skb hash.  However, before
>>>> 877d1f6291f8 ("net: Set sk_txhash from a random number") it was OK for
>>>> OvS to use the skb hash as a canonical 4-tuple hash.
>>>>
>>>
>>> I do not think we can undo stuff that was done back in 2015
>>>
>>
>> I understand.  I guess I was kind of grasping at straws in the hope of
>> getting a canonical 4-tuple hash.
>>
>>> Has anyone complained ?
>>>
>>
>> It did go unnoticed for a while but recently we started getting
>> (indirect) reports due to the hash changing.
>>
>> This one is from an upstream OVN (OvS) user:
>> https://github.com/ovn-org/ovn/issues/112
>>
>> This is from an OpenShift (also running OVN/OvS) user:
>> https://issues.redhat.com/browse/OCPBUGS-7406
>>

I just realized we need a bit more context here.  It started being a
visible problem after 265f94ff54d6 ("net: Recompute sk_txhash on
negative routing advice") and also after 3acf3ec3f4b0 ("tcp: Change
txhash on every SYN and RTO retransmit") when retransmits started
changing the txhash and implicitly the hash used by OvS.

>>> Note that skb->hash has never been considered as canonical, for obvious reasons.
> 
> I guess, the other point here is that it's not an L4 hash either.
> 
> It's a random number.  So, the documentation will still not be
> correct even after the change proposed in this patch.
> 
> 
> One other solution to the problem might be to stop setting l4_hash
> flag while it's a random number.
> 
> One way to not break everything doing that will be to introduce a
> new flag, e.g. 'rnd_hash' that will be a hash that is "not related
> to packet fields, but provides a uniform distribution over flows".
> 
> skb_get_hash() then may return the current hash if it's any of
> l4, rnd or sw.  That should preserve the current logic across
> the kernel code.
> But having a new flag, we could introduce a new helper, for example
> skb_get_stable_hash() or skb_get_hash_nonrandom() or something like
> that, that will be equal to the current version of skb_get_hash(),
> i.e. not take the random hash into account.
> 
> Affected subsystems (OVS, ECMP, SRv6) can be changed to use that
> new function.  This way these subsystems will get a software hash
> based on the real packet fields, if it was originally random.
> This will also preserve ability to use hash provided by the HW,
> since it is not normally random.
> 
> With that, we'll also not need to have in the API something that has
> 'L4' in the name and in the docs, but has no relation to packet fields.
> It can be argued that the description in the doc doesn't mean that
> this hash is computed using L4 packet fields, but it's confusing
> regardless and getting overlooked while creating new code, as it
> shown by the issues in multiple substystems.
> 
> Hope this makes some sense.
> 
> 
> Dumitru also had some alternative ideas on how to provide a stable
> hash to subsystems that need it, but I'll leave it to him.
> 
What I had in mind is not really a stable hash but a "good enough
alternative".  It's probably "good enough" (at least for OvS/OVN) if the
hash used by OvS doesn't change throughout the lifetime of a TCP session.

Would it be possible to save the original (random) hash that was
generated for a locally terminated TCP session?  E.g., a new field in
'struct sock'.  It would be in essence a random tag associated to the
session that doesn't change throughout the lifetime of the session.
Unlike sk->sk_txhash which changes on retransmit/negative routing advice.

That means OvS doesn't have to compute a stable hash every time it
processes a packet,  It would just access this value through
skb->sk->good_name_for_this_new_tag.  The advantage is that it gives the
appearance of a canonical 4-tuple hash throughout the lifetime of a
session and it doesn't affect any of the use cases that required
877d1f6291f8 ("net: Set sk_txhash from a random number").

I probably missed relevant things but I thought it might be worth
sharing in case the idea has some value.

Regards,
Dumitru

> Best regards, Ilya Maximets.
> 
>>>
>>>
>>>> Best regards,
>>>> Dumitru
>>>>
>>>>>
>>>>> Signed-off-by: Antoine Tenart <atenart@...nel.org>
>>>>> ---
>>>>>  include/linux/skbuff.h | 4 ++--
>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>>>>> index 738776ab8838..f54c84193b23 100644
>>>>> --- a/include/linux/skbuff.h
>>>>> +++ b/include/linux/skbuff.h
>>>>> @@ -791,8 +791,8 @@ typedef unsigned char *sk_buff_data_t;
>>>>>   *   @active_extensions: active extensions (skb_ext_id types)
>>>>>   *   @ndisc_nodetype: router type (from link layer)
>>>>>   *   @ooo_okay: allow the mapping of a socket to a queue to be changed
>>>>> - *   @l4_hash: indicate hash is a canonical 4-tuple hash over transport
>>>>> - *           ports.
>>>>> + *   @l4_hash: indicate hash is from layer 4 and provides a uniform
>>>>> + *           distribution over flows.
>>>>>   *   @sw_hash: indicates hash was computed in software stack
>>>>>   *   @wifi_acked_valid: wifi_acked was set
>>>>>   *   @wifi_acked: whether frame was acked on wifi or not
>>>>
>>>
>>
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ