[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89i+aMsdJ+oQmo1y4sbjJM40NJpFNYn2YT5Gf1WrfMc1nOg@mail.gmail.com>
Date: Thu, 26 Jun 2025 07:36:58 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Kuniyuki Iwashima <kuni1840@...il.com>
Cc: "David S. Miller" <davem@...emloft.net>, David Ahern <dsahern@...nel.org>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Kuniyuki Iwashima <kuniyu@...gle.com>, netdev@...r.kernel.org
Subject: Re: [PATCH v2 net-next 06/15] ipv6: mcast: Don't hold RTNL for
IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP.
On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@...il.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@...gle.com>
>
> In __ipv6_sock_mc_join(), per-socket mld data is protected by lock_sock(),
> and only __dev_get_by_index() requires RTNL.
>
> Let's use dev_get_by_index() and drop RTNL for IPV6_ADD_MEMBERSHIP and
> MCAST_JOIN_GROUP.
>
> Note that we must call rt6_lookup() and dev_hold() under RCU.
>
> If rt6_lookup() returns an entry from the exception table, dst_dev_put()
> could change rt->dev.dst to loopback concurrently, and the original device
> could lose the refcount before dev_hold() and unblock device registration.
>
> dst_dev_put() is called from NETDEV_UNREGISTER and synchronize_net() follows
> it, so as long as rt6_lookup() and dev_hold() are called within the same
> RCU critical section, the dev is alive.
>
> Even if the race happens, they are synchronised by idev->dead and mcast
> addresses are cleaned up.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@...gle.com>
> ---
> v2: Hold rcu_read_lock() around rt6_lookup & dev_hold()
> ---
> net/ipv6/ipv6_sockglue.c | 2 --
> net/ipv6/mcast.c | 22 ++++++++++++----------
> 2 files changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> index 1e225e6489ea..cb0dc885cbe4 100644
> --- a/net/ipv6/ipv6_sockglue.c
> +++ b/net/ipv6/ipv6_sockglue.c
> @@ -121,11 +121,9 @@ static bool setsockopt_needs_rtnl(int optname)
> {
> switch (optname) {
> case IPV6_ADDRFORM:
> - case IPV6_ADD_MEMBERSHIP:
> case IPV6_DROP_MEMBERSHIP:
> case IPV6_JOIN_ANYCAST:
> case IPV6_LEAVE_ANYCAST:
> - case MCAST_JOIN_GROUP:
> case MCAST_LEAVE_GROUP:
> case MCAST_JOIN_SOURCE_GROUP:
> case MCAST_LEAVE_SOURCE_GROUP:
> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> index b3f063b5ffd7..9fc7672926bf 100644
> --- a/net/ipv6/mcast.c
> +++ b/net/ipv6/mcast.c
> @@ -175,14 +175,12 @@ static int unsolicited_report_interval(struct inet6_dev *idev)
> static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
> const struct in6_addr *addr, unsigned int mode)
> {
> - struct net_device *dev = NULL;
> - struct ipv6_mc_socklist *mc_lst;
> struct ipv6_pinfo *np = inet6_sk(sk);
> + struct ipv6_mc_socklist *mc_lst;
> struct net *net = sock_net(sk);
> + struct net_device *dev = NULL;
> int err;
>
> - ASSERT_RTNL();
> -
> if (!ipv6_addr_is_multicast(addr))
> return -EINVAL;
>
> @@ -202,13 +200,18 @@ static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
>
> if (ifindex == 0) {
> struct rt6_info *rt;
> +
> + rcu_read_lock();
> rt = rt6_lookup(net, addr, NULL, 0, NULL, 0);
> if (rt) {
> dev = rt->dst.dev;
We probably need safety here, READ_ONCE() at minimum.
This can probably be done in a separate series.
Reviewed-by: Eric Dumazet <edumazet@...gle.com>
Powered by blists - more mailing lists