netdev - Re: [PATCH v5 net-next 3/3] ipv4/udp: Add 4-tuple hash for connected socket

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <80fbd73f-ce75-44bf-a444-116217a50c91@linux.alibaba.com>
Date: Fri, 25 Oct 2024 11:50:42 +0800
From: Philo Lu <lulie@...ux.alibaba.com>
To: Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org
Cc: willemdebruijn.kernel@...il.com, davem@...emloft.net,
 edumazet@...gle.com, kuba@...nel.org, dsahern@...nel.org,
 antony.antony@...unet.com, steffen.klassert@...unet.com,
 linux-kernel@...r.kernel.org, dust.li@...ux.alibaba.com,
 jakub@...udflare.com, fred.cc@...baba-inc.com,
 yubing.qiuyubing@...baba-inc.com
Subject: Re: [PATCH v5 net-next 3/3] ipv4/udp: Add 4-tuple hash for connected
 socket



On 2024/10/24 23:01, Paolo Abeni wrote:
> On 10/18/24 13:45, Philo Lu wrote:
> [...]
>> +/* In hash4, rehash can also happen in connect(), where hash4_cnt keeps unchanged. */
>> +static void udp4_rehash4(struct udp_table *udptable, struct sock *sk, u16 newhash4)
>> +{
>> +	struct udp_hslot *hslot4, *nhslot4;
>> +
>> +	hslot4 = udp_hashslot4(udptable, udp_sk(sk)->udp_lrpa_hash);
>> +	nhslot4 = udp_hashslot4(udptable, newhash4);
>> +	udp_sk(sk)->udp_lrpa_hash = newhash4;
>> +
>> +	if (hslot4 != nhslot4) {
>> +		spin_lock_bh(&hslot4->lock);
>> +		hlist_del_init_rcu(&udp_sk(sk)->udp_lrpa_node);
>> +		hslot4->count--;
>> +		spin_unlock_bh(&hslot4->lock);
>> +
>> +		synchronize_rcu();
> 
> This deserve a comment explaining why it's needed. I had to dig in past
> revision to understand it.
> 

Got it. And a short explanation here (see [1] for detail):

Here, we move a node from a hlist to another new one, i.e., update 
node->next from the old hlist to the new hlist. For readers traversing 
the old hlist, if we update node->next just when readers move onto the 
moved node, then the readers also move to the new hlist. This is unexpected.

     Reader(lookup)     Writer(rehash)
     -----------------  ---------------
1. rcu_read_lock()
2. pos = sk;
3.                     hlist_del_init_rcu(sk, old_slot)
4.                     hlist_add_head_rcu(sk, new_slot)
5. pos = pos->next; <=
6. rcu_read_unlock()

[1]
https://lore.kernel.org/all/0fb425e0-5482-4cdf-9dc1-3906751f8f81@linux.alibaba.com/

>> +
>> +		spin_lock_bh(&nhslot4->lock);
>> +		hlist_add_head_rcu(&udp_sk(sk)->udp_lrpa_node, &nhslot4->head);
>> +		nhslot4->count++;
>> +		spin_unlock_bh(&nhslot4->lock);
>> +	}
>> +}
>> +
>> +static void udp4_unhash4(struct udp_table *udptable, struct sock *sk)
>> +{
>> +	struct udp_hslot *hslot2, *hslot4;
>> +
>> +	if (udp_hashed4(sk)) {
>> +		hslot2 = udp_hashslot2(udptable, udp_sk(sk)->udp_portaddr_hash);
>> +		hslot4 = udp_hashslot4(udptable, udp_sk(sk)->udp_lrpa_hash);
>> +
>> +		spin_lock(&hslot4->lock);
>> +		hlist_del_init_rcu(&udp_sk(sk)->udp_lrpa_node);
>> +		hslot4->count--;
>> +		spin_unlock(&hslot4->lock);
>> +
>> +		spin_lock(&hslot2->lock);
>> +		udp_hash4_dec(hslot2);
>> +		spin_unlock(&hslot2->lock);
>> +	}
>> +}
>> +
>> +/* call with sock lock */
>> +static void udp4_hash4(struct sock *sk)
>> +{
>> +	struct udp_hslot *hslot, *hslot2, *hslot4;
>> +	struct net *net = sock_net(sk);
>> +	struct udp_table *udptable;
>> +	unsigned int hash;
>> +
>> +	if (sk_unhashed(sk) || inet_sk(sk)->inet_rcv_saddr == htonl(INADDR_ANY))
>> +		return;
>> +
>> +	hash = udp_ehashfn(net, inet_sk(sk)->inet_rcv_saddr, inet_sk(sk)->inet_num,
>> +			   inet_sk(sk)->inet_daddr, inet_sk(sk)->inet_dport);
>> +
>> +	udptable = net->ipv4.udp_table;
>> +	if (udp_hashed4(sk)) {
>> +		udp4_rehash4(udptable, sk, hash);
> 
> It's unclear to me how we can enter this branch. Also it's unclear why
> here you don't need to call udp_hash4_inc()udp_hash4_dec, too. Why such
> accounting can't be placed in udp4_rehash4()?
> 

It's possible that a connected udp socket _re-connect_ to another remote 
address. Then, because the local address is not changed, hash2 and its 
hash4_cnt keep unchanged. But rehash4 need to be done.

I'll also add a comment here.

> [...]
>> @@ -2031,6 +2180,19 @@ void udp_lib_rehash(struct sock *sk, u16 newhash)
>>   				spin_unlock(&nhslot2->lock);
>>   			}
>>   
>> +			if (udp_hashed4(sk)) {
>> +				udp4_rehash4(udptable, sk, newhash4);
>> +
>> +				if (hslot2 != nhslot2) {
>> +					spin_lock(&hslot2->lock);
>> +					udp_hash4_dec(hslot2);
>> +					spin_unlock(&hslot2->lock);
>> +
>> +					spin_lock(&nhslot2->lock);
>> +					udp_hash4_inc(nhslot2);
>> +					spin_unlock(&nhslot2->lock);
>> +				}
>> +			}
>>   			spin_unlock_bh(&hslot->lock);
> 
> The udp4_rehash4() call above is in atomic context and could end-up
> calling synchronize_rcu() which is a blocking function. You must avoid that.
> 

I see, synchronize_rcu() cannot be used with spinlock. However, I don't 
have a good idea to solve it by now. Do you have any thoughts or 
suggestions?

> Cheers,
> 
> Paolo

Thanks for your reviewing, Paolo. I'll address all your concerns in the 
next version.

-- 
Philo