lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b6a40fe0-09da-4b84-aa8d-0dcebf15d822@linux.dev>
Date: Wed, 3 Sep 2025 19:51:07 +0800
From: luoxuanqiang <xuanqiang.luo@...ux.dev>
To: Eric Dumazet <edumazet@...gle.com>
Cc: kuniyu@...gle.com, davem@...emloft.net, kuba@...nel.org,
 kernelxing@...cent.com, netdev@...r.kernel.org,
 Xuanqiang Luo <luoxuanqiang@...inos.cn>
Subject: Re: [PATCH net] inet: Avoid established lookup missing active sk


在 2025/9/3 14:40, Eric Dumazet 写道:
> On Tue, Sep 2, 2025 at 7:46 PM Xuanqiang Luo <xuanqiang.luo@...ux.dev> wrote:
>> From: Xuanqiang Luo <luoxuanqiang@...inos.cn>
>>
>> Since the lookup of sk in ehash is lockless, when one CPU is performing a
>> lookup while another CPU is executing delete and insert operations
>> (deleting reqsk and inserting sk), the lookup CPU may miss either of
>> them, if sk cannot be found, an RST may be sent.
>>
>> The call trace map is drawn as follows:
>>     CPU 0                           CPU 1
>>     -----                           -----
>>                                  spin_lock()
>>                                  sk_nulls_del_node_init_rcu(osk)
>> __inet_lookup_established()
>>                                  __sk_nulls_add_node_rcu(sk, list)
>>                                  spin_unlock()
>>
>> We can try using spin_lock()/spin_unlock() to wait for ehash updates
>> (ensuring all deletions and insertions are completed) after a failed
>> lookup in ehash, then lookup sk again after the update. Since the sk
>> expected to be found is unlikely to encounter the aforementioned scenario
>> multiple times consecutively, we only need one update.
> No need for a lock really...
> - add the new node (with a temporary 'wrong' nulls value),
> - delete the old node
> - replace the nulls value by the expected one.
>
Hi Eric, May I ask if your main purpose is to add a temporary 'wrong' 
nulls to trigger a re-lookup in the lookup, until the real new sk is 
successfully replaced, like the following code (rough code)?

Thanks
Xuanqiang

--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -526,6 +526,8 @@ struct sock *__inet_lookup_established(const struct net *net,
       * not the expected one, we must restart lookup.
       * We probably met an item that was moved to another chain.
       */
+
+    /* Here, temporary NULL values cause a re-lookup. */
      if (get_nulls_value(node) != slot)
          goto begin;
  out:
@@ -684,7 +686,34 @@ bool inet_ehash_insert(struct sock *sk, struct sock *osk, bool *found_dup_sk)
  
      spin_lock(lock);
      if (osk) {
+        struct hlist_nulls_node *i, *last = NULL, *n = &sk->sk_nulls_node;
          WARN_ON_ONCE(sk->sk_hash != osk->sk_hash);
+
+        /* find the last node in the list */
+        for (i = list->first; !is_a_nulls(i); i = i->next)
+            last = i;
+
+        if (last) {
+            /* save the original nulls values  */
+            i = last->next;
+
+            /* 1. add the temporary 'wrong' nulls value */
+            rcu_assign_pointer(hlist_nulls_next_rcu(last),
+                       (struct hlist_nulls_node *)NULLS_MARKER(NULL));
+
+            /* 2. delete the old node */
+            ret = sk_nulls_del_node_init_rcu(osk);
+
+            /* 3. add the new node */
+            if (ret) {
+                WRITE_ONCE(n->next, i);
+                n->pprev = &last->next;
+                rcu_assign_pointer(hlist_nulls_next_rcu(last), n);
+            } else {
+                rcu_assign_pointer(hlist_nulls_next_rcu(last), i);
+            }
+            goto unlock;
+        }
          ret = sk_nulls_del_node_init_rcu(osk);
      } else if (found_dup_sk) {
          *found_dup_sk = inet_ehash_lookup_by_sk(sk, list);
@@ -695,6 +724,7 @@ bool inet_ehash_insert(struct sock *sk, struct sock *osk, bool *found_dup_sk)
      if (ret)
          __sk_nulls_add_node_rcu(sk, list);
  
+unlock:
      spin_unlock(lock);
  
      return ret;


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ