Message-ID: <20250712180816.3987876-1-kuniyu@google.com>
Date: Sat, 12 Jul 2025 18:07:51 +0000
From: Kuniyuki Iwashima <kuniyu@...gle.com>
To: horms@...nel.org
Cc: davem@...emloft.net, dsahern@...nel.org, edumazet@...gle.com, 
	kuba@...nel.org, kuni1840@...il.com, kuniyu@...gle.com, 
	netdev@...r.kernel.org, pabeni@...hat.com
Subject: Re: [PATCH v1 net-next 06/14] neighbour: Free pneigh_entry after RCU
 grace period.

From: Simon Horman <horms@...nel.org>
Date: Sat, 12 Jul 2025 16:01:59 +0100
> On Fri, Jul 11, 2025 at 07:06:11PM +0000, Kuniyuki Iwashima wrote:
> > We will convert RTM_GETNEIGH to RCU.
> > 
> > neigh_get() looks up pneigh_entry by pneigh_lookup() and passes
> > it to pneigh_fill_info().
> > 
> > Then, we must ensure that the entry is alive till pneigh_fill_info()
> > completes, but read_lock_bh(&tbl->lock) in pneigh_lookup() does not
> > guarantee that.
> > 
> > Also, we will convert all readers of tbl->phash_buckets[] to RCU.
> > 
> > Let's use call_rcu() to free pneigh_entry and update phash_buckets[]
> > and ->next by rcu_assign_pointer().
> > 
> > pneigh_ifdown_and_unlock() uses list_head to avoid overwriting
> > ->next and moving RCU iterators to another list.
> > 
> > pndisc_destructor() (only IPv6 ndisc uses this) uses a mutex, so it
> > is not delayed to call_rcu(), where we cannot sleep.  This is fine
> > because the mcast code works with RCU and ipv6_dev_mc_dec() frees
> > mcast objects after RCU grace period.
> > 
> > While at it, we change the return type of pneigh_ifdown_and_unlock()
> > to void.
> > 
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@...gle.com>
> > ---
> >  include/net/neighbour.h |  4 ++++
> >  net/core/neighbour.c    | 51 +++++++++++++++++++++++++----------------
> >  2 files changed, 35 insertions(+), 20 deletions(-)
> > 
> > diff --git a/include/net/neighbour.h b/include/net/neighbour.h
> > index 7f3d57da5689a..a877e56210b22 100644
> > --- a/include/net/neighbour.h
> > +++ b/include/net/neighbour.h
> > @@ -180,6 +180,10 @@ struct pneigh_entry {
> >  	possible_net_t		net;
> >  	struct net_device	*dev;
> >  	netdevice_tracker	dev_tracker;
> > +	union {
> > +		struct list_head	free_node;
> > +		struct rcu_head		rcu;
> > +	};
> >  	u32			flags;
> >  	u8			protocol;
> >  	bool			permanent;
> > diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> > index 814a45fb1962e..6725a40b2db3a 100644
> > --- a/net/core/neighbour.c
> > +++ b/net/core/neighbour.c
> > @@ -54,9 +54,9 @@ static void neigh_timer_handler(struct timer_list *t);
> >  static void __neigh_notify(struct neighbour *n, int type, int flags,
> >  			   u32 pid);
> >  static void neigh_update_notify(struct neighbour *neigh, u32 nlmsg_pid);
> > -static int pneigh_ifdown_and_unlock(struct neigh_table *tbl,
> > -				    struct net_device *dev,
> > -				    bool skip_perm);
> > +static void pneigh_ifdown_and_unlock(struct neigh_table *tbl,
> > +				     struct net_device *dev,
> > +				     bool skip_perm);
> >  
> >  #ifdef CONFIG_PROC_FS
> >  static const struct seq_operations neigh_stat_seq_ops;
> > @@ -803,12 +803,20 @@ struct pneigh_entry *pneigh_create(struct neigh_table *tbl,
> >  
> >  	write_lock_bh(&tbl->lock);
> >  	n->next = tbl->phash_buckets[hash_val];
> > -	tbl->phash_buckets[hash_val] = n;
> > +	rcu_assign_pointer(tbl->phash_buckets[hash_val], n);
> 
> Hi Iwashima-san,
> 
> A heads-up that unfortunately Sparse is unhappy about the __rcu annotations
> here, and elsewhere in this patch (set).
> 
> For this patch I see:
> 
>   .../neighbour.c:860:33: error: incompatible types in comparison expression (different address spaces):
>   .../neighbour.c:860:33:    struct pneigh_entry [noderef] __rcu *
>   .../neighbour.c:860:33:    struct pneigh_entry *
>   .../neighbour.c:806:9: error: incompatible types in comparison expression (different address spaces):
>   .../neighbour.c:806:9:    struct pneigh_entry [noderef] __rcu *
>   .../neighbour.c:806:9:    struct pneigh_entry *
>   .../neighbour.c:832:25: error: incompatible types in comparison expression (different address spaces):
>   .../neighbour.c:832:25:    struct pneigh_entry [noderef] __rcu *
>   .../neighbour.c:832:25:    struct pneigh_entry *

Thanks for the heads-up, Simon!

The diff below was needed on top of the series, but as I gradually added
rcu_dereference_check(), I will probably need to churn this patch 6 more.

Anyway, I'll fix every annotation warning in v2.

---8<---
diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 9bc5be41a6d09..f1fd15fbbb800 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -178,7 +178,7 @@ struct neigh_ops {
 };
 
 struct pneigh_entry {
-       struct pneigh_entry     *next;
+       struct pneigh_entry     __rcu *next;
        possible_net_t          net;
        struct net_device       *dev;
        netdevice_tracker       dev_tracker;
@@ -243,7 +243,7 @@ struct neigh_table {
        struct neigh_statistics __percpu *stats;
        struct neigh_hash_table __rcu *nht;
        struct mutex            phash_lock;
-       struct pneigh_entry     **phash_buckets;
+       struct pneigh_entry     __rcu **phash_buckets;
        struct srcu_struct      srcu;
 };
 
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 9bbf6d514abe6..1e8832a3e0176 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -804,17 +804,18 @@ static void pneigh_destroy(struct rcu_head *rcu)
 int pneigh_delete(struct neigh_table *tbl, struct net *net, const void *pkey,
                  struct net_device *dev)
 {
-       struct pneigh_entry *n, **np;
+       struct pneigh_entry *n, __rcu **np;
        unsigned int key_len = tbl->key_len;
        u32 hash_val = pneigh_hash(pkey, key_len);
 
        mutex_lock(&tbl->phash_lock);
 
-       for (np = &tbl->phash_buckets[hash_val]; (n = *np) != NULL;
+       for (np = &tbl->phash_buckets[hash_val];
+            (n = rcu_dereference_protected(*np, 1)) != NULL;
             np = &n->next) {
                if (!memcmp(n->key, pkey, key_len) && n->dev == dev &&
                    net_eq(pneigh_net(n), net)) {
-                       rcu_assign_pointer(*np, n->next);
+                       rcu_assign_pointer(*np, rcu_dereference_protected(n->next, 1));
 
                        mutex_unlock(&tbl->phash_lock);
 
@@ -833,7 +834,7 @@ int pneigh_delete(struct neigh_table *tbl, struct net *net, const void *pkey,
 static void pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev,
                          bool skip_perm)
 {
-       struct pneigh_entry *n, **np;
+       struct pneigh_entry *n, __rcu **np;
        LIST_HEAD(head);
        u32 h;
 
@@ -841,7 +842,7 @@ static void pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev,
 
        for (h = 0; h <= PNEIGH_HASHMASK; h++) {
                np = &tbl->phash_buckets[h];
-               while ((n = *np) != NULL) {
+               while ((n = rcu_dereference_protected(*np, 1)) != NULL) {
                        if (skip_perm && n->permanent)
                                goto skip;
                        if (!dev || n->dev == dev) {
---8<---
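
For readers who want to try the pointer-to-pointer walk outside the kernel, here is a rough userspace sketch of the same publish/unlink pattern. It is not kernel code and not part of the series. C11 atomics stand in for the RCU helpers: a release store for rcu_assign_pointer(), a relaxed load for rcu_dereference_protected(), and an acquire load for rcu_dereference(). The names struct entry, NBUCKETS, publish(), deref_protected(), deref_reader(), entry_add(), entry_del() and entry_find() are all hypothetical. There is no grace period here, so the immediate free() is only safe single-threaded; the kernel patch defers the free with call_rcu().

```c
/* Userspace sketch only: C11 atomics stand in for the kernel RCU helpers. */
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NBUCKETS 16	/* hypothetical size, chosen for the example */

struct entry {
	_Atomic(struct entry *) next;	/* plays the role of "__rcu *next" */
	unsigned int key;
};

/* Plays the role of "__rcu **phash_buckets"; zero-initialised. */
static _Atomic(struct entry *) buckets[NBUCKETS];

/* Stand-in for rcu_assign_pointer(): release store publishes the entry. */
static void publish(_Atomic(struct entry *) *slot, struct entry *e)
{
	atomic_store_explicit(slot, e, memory_order_release);
}

/* Stand-in for rcu_dereference_protected(): the caller is assumed to be
 * the only updater, so a relaxed load is enough. */
static struct entry *deref_protected(_Atomic(struct entry *) *slot)
{
	return atomic_load_explicit(slot, memory_order_relaxed);
}

/* Stand-in for rcu_dereference() on the reader side. */
static struct entry *deref_reader(_Atomic(struct entry *) *slot)
{
	return atomic_load_explicit(slot, memory_order_acquire);
}

/* Insert at the head of the bucket: set ->next first, then publish. */
static int entry_add(unsigned int key)
{
	_Atomic(struct entry *) *head = &buckets[key % NBUCKETS];
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->key = key;
	atomic_init(&e->next, deref_protected(head));
	publish(head, e);
	return 0;
}

/* Unlink with the same pointer-to-pointer walk as pneigh_delete(). */
static int entry_del(unsigned int key)
{
	_Atomic(struct entry *) *np;
	struct entry *e;

	for (np = &buckets[key % NBUCKETS];
	     (e = deref_protected(np)) != NULL;
	     np = &e->next) {
		if (e->key == key) {
			publish(np, deref_protected(&e->next));
			free(e);	/* no grace period in this sketch */
			return 0;
		}
	}
	return -1;
}

/* Reader-side lookup: returns 1 if the key is present, 0 otherwise. */
static int entry_find(unsigned int key)
{
	struct entry *e;

	for (e = deref_reader(&buckets[key % NBUCKETS]); e;
	     e = deref_reader(&e->next))
		if (e->key == key)
			return 1;
	return 0;
}
```

The point of the typed accessors is the same as with the __rcu annotation: every load and store of a chain pointer has to go through a helper, so a plain "n = *np" no longer slips through unnoticed.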
