linux-kernel - Re: [PATCH v5 net-next 3/3] ipv4/udp: Add 4-tuple hash for connected socket


Message-ID: <c1eca766-d5e7-4fd8-8ffa-9301f060d6c9@linux.alibaba.com>
Date: Sat, 26 Oct 2024 09:39:57 +0800
From: Philo Lu <lulie@...ux.alibaba.com>
To: Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org
Cc: willemdebruijn.kernel@...il.com, davem@...emloft.net,
 edumazet@...gle.com, kuba@...nel.org, dsahern@...nel.org,
 antony.antony@...unet.com, steffen.klassert@...unet.com,
 linux-kernel@...r.kernel.org, dust.li@...ux.alibaba.com,
 jakub@...udflare.com, fred.cc@...baba-inc.com,
 yubing.qiuyubing@...baba-inc.com
Subject: Re: [PATCH v5 net-next 3/3] ipv4/udp: Add 4-tuple hash for connected
 socket



On 2024/10/25 17:02, Paolo Abeni wrote:
> On 10/25/24 05:50, Philo Lu wrote:
>> On 2024/10/24 23:01, Paolo Abeni wrote:
>>> On 10/18/24 13:45, Philo Lu wrote:
>>> [...]
>>>> +/* In hash4, rehash can also happen in connect(), where hash4_cnt keeps unchanged. */
>>>> +static void udp4_rehash4(struct udp_table *udptable, struct sock *sk, u16 newhash4)
>>>> +{
>>>> +	struct udp_hslot *hslot4, *nhslot4;
>>>> +
>>>> +	hslot4 = udp_hashslot4(udptable, udp_sk(sk)->udp_lrpa_hash);
>>>> +	nhslot4 = udp_hashslot4(udptable, newhash4);
>>>> +	udp_sk(sk)->udp_lrpa_hash = newhash4;
>>>> +
>>>> +	if (hslot4 != nhslot4) {
>>>> +		spin_lock_bh(&hslot4->lock);
>>>> +		hlist_del_init_rcu(&udp_sk(sk)->udp_lrpa_node);
>>>> +		hslot4->count--;
>>>> +		spin_unlock_bh(&hslot4->lock);
>>>> +
>>>> +		synchronize_rcu();
>>>
>>> This deserves a comment explaining why it's needed. I had to dig into past
>>> revisions to understand it.
>>>
>>
>> Got it. Here is a short explanation (see [1] for details):
>>
>> Here, we move a node from one hlist to another, i.e., node->next is
>> updated to point into the new hlist. If that update happens just after a
>> reader traversing the old hlist has moved onto the node, the reader
>> follows node->next into the new hlist. This is unexpected.
>>
>>       Reader(lookup)     Writer(rehash)
>>       -----------------  ---------------
>> 1. rcu_read_lock()
>> 2. pos = sk;
>> 3.                     hlist_del_init_rcu(sk, old_slot)
>> 4.                     hlist_add_head_rcu(sk, new_slot)
>> 5. pos = pos->next; <=
>> 6. rcu_read_unlock()
>>
>> [1]
>> https://lore.kernel.org/all/0fb425e0-5482-4cdf-9dc1-3906751f8f81@linux.alibaba.com/
> 
> Thanks. AFAICS the problem this could cause is a lookup failure for a
> socket positioned later in the same chain, when an earlier entry is
> moved to a different slot during a concurrent lookup.
> 

Yes, you're right.
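
To make the failure mode concrete, here is a minimal single-threaded
userspace sketch of the interleaving in the diagram above. It is not kernel
code: struct node, struct bucket, push_head(), unlink_node() and
reader_finds_key() are hypothetical stand-ins for the hlist helpers, and the
"writer" step is simply run in the middle of the reader's walk. The only
property borrowed from the kernel is that unlinking leaves the removed
node's next pointer intact for readers that are still standing on it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for hlist_node / udp_hslot (illustration only). */
struct node {
	int key;
	struct node *next;
};

struct bucket {
	struct node *first;
};

static void push_head(struct bucket *b, struct node *n)
{
	n->next = b->first;
	b->first = n;
}

/* Unlink n but leave n->next intact, so a reader on n can keep walking. */
static void unlink_node(struct bucket *b, struct node *n)
{
	struct node **pp = &b->first;

	while (*pp && *pp != n)
		pp = &(*pp)->next;
	if (*pp)
		*pp = n->next;
}

/*
 * A reader walks old_slot (a -> b) looking for key 2 (node b).
 * If move_during_walk is set, a writer moves node a to new_slot
 * while the reader is positioned on a.
 * Returns 1 if the reader finds key 2, 0 otherwise.
 */
int reader_finds_key(int move_during_walk)
{
	struct node a = { 1, NULL }, b = { 2, NULL }, c = { 3, NULL };
	struct bucket old_slot = { NULL }, new_slot = { NULL };
	struct node *pos;

	push_head(&old_slot, &b);
	push_head(&old_slot, &a);	/* old_slot: a -> b */
	push_head(&new_slot, &c);	/* new_slot: c */

	pos = old_slot.first;		/* reader: pos = a */
	assert(pos == &a);

	if (move_during_walk) {
		unlink_node(&old_slot, &a);	/* writer: delete from old slot */
		push_head(&new_slot, &a);	/* writer: a->next now points to c */
	}

	/* reader resumes: pos = pos->next */
	for (pos = pos->next; pos; pos = pos->next)
		if (pos->key == 2)
			return 1;
	return 0;
}
```

With move_during_walk == 0 the reader reaches b; with it set, the reader
follows a->next into new_slot, reaches the end of that chain, and reports a
miss even though b never left old_slot.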

> I think that could be solved the same way TCP handles such a scenario:
> use an hlist_nulls RCU list for the hash4 bucket, check that a failed
> lookup ends in the same bucket where it started, and if it does not,
> restart the lookup from the original bucket.
> 
> Have a look at __inet_lookup_established() for a more descriptive
> reference, especially:
> 
> https://elixir.bootlin.com/linux/v6.12-rc4/source/net/ipv4/inet_hashtables.c#L528
> 

Thank you! I'll try it in the next version.
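
As I understand the suggestion, the idea can be sketched in userspace as
follows. This is only an illustration under my own assumptions, not the
kernel's hlist_nulls implementation and not what the next version will look
like: struct nnode, struct table, make_nulls(), is_nulls(), nulls_value()
and nulls_lookup_demo() are hypothetical names. Each chain ends in an odd
"nulls" marker that encodes its bucket index instead of a plain NULL, so a
reader that ends on a foreign marker knows it was carried into another
chain and restarts from its original bucket.

```c
#include <assert.h>
#include <stdint.h>

#define NSLOTS 2

/* next is either a node pointer (low bit 0) or a nulls marker (low bit 1). */
struct nnode {
	int key;
	uintptr_t next;
};

struct table {
	uintptr_t first[NSLOTS];
};

static uintptr_t make_nulls(unsigned int slot)
{
	return ((uintptr_t)slot << 1) | 1;
}

static int is_nulls(uintptr_t p)
{
	return (int)(p & 1);
}

static unsigned int nulls_value(uintptr_t p)
{
	return (unsigned int)(p >> 1);
}

static void add_head(struct table *t, unsigned int slot, struct nnode *n)
{
	n->next = t->first[slot];
	t->first[slot] = (uintptr_t)n;
}

/* Unlink n from slot; n->next is left intact for concurrent readers. */
static void del_node(struct table *t, unsigned int slot, struct nnode *n)
{
	uintptr_t *pp = &t->first[slot];

	while (!is_nulls(*pp) && *pp != (uintptr_t)n)
		pp = &((struct nnode *)*pp)->next;
	if (!is_nulls(*pp))
		*pp = n->next;
}

/*
 * Look up key 2 in slot 0 (a -> b) while node a is moved to slot 1
 * during the walk. Returns 1 if found; *restarts counts retried walks.
 */
int nulls_lookup_demo(int *restarts)
{
	struct nnode a = { 1, 0 }, b = { 2, 0 }, c = { 3, 0 };
	struct table t;
	int moved = 0;
	unsigned int i;

	for (i = 0; i < NSLOTS; i++)
		t.first[i] = make_nulls(i);
	add_head(&t, 0, &b);
	add_head(&t, 0, &a);	/* slot 0: a -> b -> nulls(0) */
	add_head(&t, 1, &c);	/* slot 1: c -> nulls(1) */

	*restarts = 0;
	for (;;) {
		uintptr_t pos = t.first[0];

		while (!is_nulls(pos)) {
			struct nnode *n = (struct nnode *)pos;

			if (n->key == 2)
				return 1;
			if (!moved && n == &a) {
				/* simulated concurrent rehash of a */
				del_node(&t, 0, &a);
				add_head(&t, 1, &a);
				moved = 1;
			}
			pos = n->next;
		}
		if (nulls_value(pos) == 0)
			return 0;	/* ended in our own bucket: real miss */
		(*restarts)++;		/* ended elsewhere: walk again */
	}
}
```

In this sketch the first walk ends on the marker of slot 1, so the lookup
is retried once and then finds b in slot 0, without the writer having to
wait for readers between the delete and the add.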

>>>> +
...
>>>> +
>>>> +/* call with sock lock */
>>>> +static void udp4_hash4(struct sock *sk)
>>>> +{
>>>> +	struct udp_hslot *hslot, *hslot2, *hslot4;
>>>> +	struct net *net = sock_net(sk);
>>>> +	struct udp_table *udptable;
>>>> +	unsigned int hash;
>>>> +
>>>> +	if (sk_unhashed(sk) || inet_sk(sk)->inet_rcv_saddr == htonl(INADDR_ANY))
>>>> +		return;
>>>> +
>>>> +	hash = udp_ehashfn(net, inet_sk(sk)->inet_rcv_saddr, inet_sk(sk)->inet_num,
>>>> +			   inet_sk(sk)->inet_daddr, inet_sk(sk)->inet_dport);
>>>> +
>>>> +	udptable = net->ipv4.udp_table;
>>>> +	if (udp_hashed4(sk)) {
>>>> +		udp4_rehash4(udptable, sk, hash);
>>>
>>> It's unclear to me how we can enter this branch. Also it's unclear why
>>> you don't need to call udp_hash4_inc()/udp_hash4_dec() here, too. Why
>>> can't such accounting be placed in udp4_rehash4()?
>>>
>>
>> It's possible for a connected UDP socket to _re-connect_ to another
>> remote address. Then, because the local address does not change, hash2
>> and its hash4_cnt stay unchanged, but rehash4 still needs to be done.
>> I'll also add a comment here.
> 
> Right, a UDP socket could actually connect() successfully twice in a row
> without a disconnect in between...
> 
> I almost missed the point that the ipv6 implementation is planned to
> land afterwards.
> 
> I'm sorry, but I think that would be problematic - i.e. if ipv4 support
> lands in 6.13, but ipv6 does not make it - due to time constraints -
> we will have (at least) one release with inconsistent behavior between
> ipv4 and ipv6. I think it would be better to bundle such changes together.
> 

No problem. I can add ipv6 support in the next version too.

Thanks.
-- 
Philo