[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAAVpQUDa8w49-mvf4=nAYLKv0aX9hmAt312_0CD+u4nSWWAv3A@mail.gmail.com>
Date: Thu, 26 Jun 2025 18:01:04 -0700
From: Kuniyuki Iwashima <kuniyu@...gle.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Kuniyuki Iwashima <kuni1840@...il.com>, "David S. Miller" <davem@...emloft.net>,
David Ahern <dsahern@...nel.org>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, netdev@...r.kernel.org
Subject: Re: [PATCH v2 net-next 06/15] ipv6: mcast: Don't hold RTNL for
IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP.
On Thu, Jun 26, 2025 at 7:37 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@...il.com> wrote:
> >
> > From: Kuniyuki Iwashima <kuniyu@...gle.com>
> >
> > In __ipv6_sock_mc_join(), per-socket mld data is protected by lock_sock(),
> > and only __dev_get_by_index() requires RTNL.
> >
> > Let's use dev_get_by_index() and drop RTNL for IPV6_ADD_MEMBERSHIP and
> > MCAST_JOIN_GROUP.
> >
> > Note that we must call rt6_lookup() and dev_hold() under RCU.
> >
> > If rt6_lookup() returns an entry from the exception table, dst_dev_put()
> > could change rt->dev.dst to loopback concurrently, and the original device
> > could lose the refcount before dev_hold() and unblock device registration.
> >
> > dst_dev_put() is called from NETDEV_UNREGISTER and synchronize_net() follows
> > it, so as long as rt6_lookup() and dev_hold() are called within the same
> > RCU critical section, the dev is alive.
> >
> > Even if the race happens, they are synchronised by idev->dead and mcast
> > addresses are cleaned up.
> >
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@...gle.com>
> > ---
> > v2: Hold rcu_read_lock() around rt6_lookup & dev_hold()
> > ---
> > net/ipv6/ipv6_sockglue.c | 2 --
> > net/ipv6/mcast.c | 22 ++++++++++++----------
> > 2 files changed, 12 insertions(+), 12 deletions(-)
> >
> > diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> > index 1e225e6489ea..cb0dc885cbe4 100644
> > --- a/net/ipv6/ipv6_sockglue.c
> > +++ b/net/ipv6/ipv6_sockglue.c
> > @@ -121,11 +121,9 @@ static bool setsockopt_needs_rtnl(int optname)
> > {
> > switch (optname) {
> > case IPV6_ADDRFORM:
> > - case IPV6_ADD_MEMBERSHIP:
> > case IPV6_DROP_MEMBERSHIP:
> > case IPV6_JOIN_ANYCAST:
> > case IPV6_LEAVE_ANYCAST:
> > - case MCAST_JOIN_GROUP:
> > case MCAST_LEAVE_GROUP:
> > case MCAST_JOIN_SOURCE_GROUP:
> > case MCAST_LEAVE_SOURCE_GROUP:
> > diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> > index b3f063b5ffd7..9fc7672926bf 100644
> > --- a/net/ipv6/mcast.c
> > +++ b/net/ipv6/mcast.c
> > @@ -175,14 +175,12 @@ static int unsolicited_report_interval(struct inet6_dev *idev)
> > static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
> > const struct in6_addr *addr, unsigned int mode)
> > {
> > - struct net_device *dev = NULL;
> > - struct ipv6_mc_socklist *mc_lst;
> > struct ipv6_pinfo *np = inet6_sk(sk);
> > + struct ipv6_mc_socklist *mc_lst;
> > struct net *net = sock_net(sk);
> > + struct net_device *dev = NULL;
> > int err;
> >
> > - ASSERT_RTNL();
> > -
> > if (!ipv6_addr_is_multicast(addr))
> > return -EINVAL;
> >
> > @@ -202,13 +200,18 @@ static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
> >
> > if (ifindex == 0) {
> > struct rt6_info *rt;
> > +
> > + rcu_read_lock();
> > rt = rt6_lookup(net, addr, NULL, 0, NULL, 0);
> > if (rt) {
> > dev = rt->dst.dev;
>
> We probably need safety here, READ_ONCE() at minimum.
Will add it and the corresponding WRITE_ONCE() in dst_dev_put().
>
> This can probably be done in a separate series.
I'll post a follow-up for other rt6_lookup() users and dst.dev
users under RCU.
Thanks!
Powered by blists - more mailing lists