Message-ID: <Z4qPjuK3_fQUYLJi@hog>
Date: Fri, 17 Jan 2025 18:12:46 +0100
From: Sabrina Dubroca <sd@...asysnail.net>
To: Antonio Quartulli <antonio@...nvpn.net>
Cc: ryazanov.s.a@...il.com, netdev@...r.kernel.org,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Donald Hunter <donald.hunter@...il.com>,
Shuah Khan <shuah@...nel.org>, Andrew Lunn <andrew+netdev@...n.ch>,
Simon Horman <horms@...nel.org>, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org, Xiao Liang <shaw.leon@...il.com>
Subject: Re: [PATCH net-next v18 20/25] ovpn: implement peer
add/get/dump/delete via netlink
2025-01-17, 13:59:35 +0100, Antonio Quartulli wrote:
> On 17/01/2025 12:48, Sabrina Dubroca wrote:
> > 2025-01-13, 10:31:39 +0100, Antonio Quartulli wrote:
> > > int ovpn_nl_peer_new_doit(struct sk_buff *skb, struct genl_info *info)
> > > {
> > > -	return -EOPNOTSUPP;
> > > +	struct nlattr *attrs[OVPN_A_PEER_MAX + 1];
> > > +	struct ovpn_priv *ovpn = info->user_ptr[0];
> > > +	struct ovpn_socket *ovpn_sock;
> > > +	struct socket *sock = NULL;
> > > +	struct ovpn_peer *peer;
> > > +	u32 sockfd, peer_id;
> > > +	int ret;
> > > +
> > > +	/* peers can only be added when the interface is up and running */
> > > +	if (!netif_running(ovpn->dev))
> > > +		return -ENETDOWN;
> >
> > Since we're not under rtnl_lock here, the device could go down while
> > we're creating this peer, and we may end up with a down device that
> > has a peer anyway.
>
> hmm, indeed. This means we must hold the rtnl_lock to prevent ending up in
> an inconsistent state.
>
> >
> > I'm not sure what this (and the peer flushing on NETDEV_DOWN) is
> > trying to accomplish. Is it a problem to keep peers when the netdevice
> > is down?
>
> This is the result of my discussion with Sergey that started in v11 5/23:
>
> https://lore.kernel.org/r/netdev/20241029-b4-ovpn-v11-5-de4698c73a25@openvpn.net/
>
> The idea was to match operational state with actual connectivity to peer(s).
>
> Originally I wanted to simply keep the carrier always on, but after further
> discussion (including the meaning of the openvpn option --persist-tun) we
> agreed on following the logic where an UP device has a peer connected (logic
> is slightly different between MP and P2P).
>
> I am not extremely happy with the resulting complexity, but it seemed to be
> a blocker for Sergey.
[after re-reading that discussion with Sergey]
I don't understand why "admin does 'ip link set tun0 down'" means "we
should get rid of all peers". For me the carrier situation goes the
other way: no peer, no carrier (as if I unplugged the cable from my
ethernet card), and it's independent of what the user does (ip link
set XXX up/down). You have that with netif_carrier_{on,off}, but
flushing peers when the admin does "ip link set tun0 down" is separate
IMO.
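
To make the distinction concrete, here is a minimal userspace sketch of
that model. It is not ovpn code: the struct and every function name are
hypothetical, and the boolean only stands in for what
netif_carrier_{on,off} would track in the driver. The point is that the
carrier follows the peer count, while the admin state is a separate knob
that leaves the peers alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model, not actual ovpn driver code. */
struct model_dev {
	bool admin_up;		/* toggled by "ip link set XXX up/down" */
	unsigned int peers;	/* number of peers currently configured */
};

/* Carrier depends on peers only: no peer, no carrier. */
static bool model_carrier_ok(const struct model_dev *d)
{
	return d->peers > 0;
}

/* Traffic can flow only when the admin state is up and carrier is on. */
static bool model_oper_up(const struct model_dev *d)
{
	return d->admin_up && model_carrier_ok(d);
}

/* Changing the admin state deliberately leaves the peers untouched. */
static void model_set_admin(struct model_dev *d, bool up)
{
	d->admin_up = up;
}

static void model_peer_add(struct model_dev *d)
{
	d->peers++;
}

static void model_peer_del(struct model_dev *d)
{
	assert(d->peers > 0);
	d->peers--;
}
```

In this model, taking the device down with one peer configured leaves
the peer in place and the carrier on; only the operational state drops.
Removing the last peer is what turns the carrier off.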
[...]
> > > int ovpn_nl_peer_del_doit(struct sk_buff *skb, struct genl_info *info)
> > > {
> > > -	return -EOPNOTSUPP;
> > > +	struct nlattr *attrs[OVPN_A_PEER_MAX + 1];
> > > +	struct ovpn_priv *ovpn = info->user_ptr[0];
> > > +	struct ovpn_peer *peer;
> > > +	u32 peer_id;
> > > +	int ret;
> > > +
> > > +	if (GENL_REQ_ATTR_CHECK(info, OVPN_A_PEER))
> > > +		return -EINVAL;
> > > +
> > > +	ret = nla_parse_nested(attrs, OVPN_A_PEER_MAX, info->attrs[OVPN_A_PEER],
> > > +			       ovpn_peer_nl_policy, info->extack);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	if (NL_REQ_ATTR_CHECK(info->extack, info->attrs[OVPN_A_PEER], attrs,
> > > +			      OVPN_A_PEER_ID))
> > > +		return -EINVAL;
> > > +
> > > +	peer_id = nla_get_u32(attrs[OVPN_A_PEER_ID]);
> > > +	peer = ovpn_peer_get_by_id(ovpn, peer_id);
> > > +	if (!peer) {
> > > +		NL_SET_ERR_MSG_FMT_MOD(info->extack,
> > > +				       "cannot find peer with id %u", peer_id);
> > > +		return -ENOENT;
> > > +	}
> > > +
> > > +	netdev_dbg(ovpn->dev, "del peer %u\n", peer->id);
> > > +	ret = ovpn_peer_del(peer, OVPN_DEL_PEER_REASON_USERSPACE);
> >
> > With the delayed socket release (which is similar to what was in v11,
> > but now with refcounting on the netdevice which should make
> > rtnl_link_unregister in ovpn_cleanup wait [*]), we may return to
> > userspace as if the peer was gone, but the socket hasn't been detached
> > yet.
> >
> > A userspace application that tries to remove the peer and immediately
> > re-create it with the same socket could get EBUSY if the workqueue
> > hasn't done its job yet. That would be quite confusing to the
> > application.
>
> This may happen only for TCP, because in the UDP case we would increase the
> refcounter and keep the socket attached.
Not if we're re-attaching to a different ovpn instance/netdevice.
>
> However, re-attaching the same TCP socket is hardly going to happen (in TCP
> we have one socket per peer, therefore if the peer is going away, we're most
> likely killing the socket too).
>
> This said, the complexity added by the completion seems quite tiny,
> therefore I'll add the code you are suggesting below.
Ok.
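
For reference, the ordering problem discussed above can be shown with a
small single-threaded userspace model. This is only a sketch under
stated assumptions, not the code suggested for the patch: all names are
hypothetical, the "workqueue" is an explicit function call, and the wait
flag merely stands in for blocking on a completion before returning to
userspace. It models the case where attaching is refused while the
socket is still attached (TCP, or UDP moving to a different instance).

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model, not actual ovpn driver code. */
struct sk_model {
	const void *owner;	/* instance the socket is attached to, or NULL */
	bool detach_pending;	/* deferred detach has been scheduled */
};

/* Attaching fails while a previous owner is still attached. */
static int sk_model_attach(struct sk_model *s, const void *owner)
{
	if (s->owner)
		return -EBUSY;
	s->owner = owner;
	return 0;
}

/* Deferred work: this is what actually detaches the socket. */
static void sk_model_detach_work(struct sk_model *s)
{
	if (s->detach_pending) {
		s->owner = NULL;
		s->detach_pending = false;
	}
}

/*
 * Peer deletion only schedules the detach. With wait == true the
 * caller does not return until the deferred work has run; here that
 * is modelled by running the work inline.
 */
static void sk_model_peer_del(struct sk_model *s, bool wait)
{
	s->detach_pending = true;
	if (wait)
		sk_model_detach_work(s);
}
```

Without the wait, an attach issued right after the delete returns
-EBUSY until the deferred work has run. With the wait, the delete
returns only once the socket is free, so an immediate re-attach
succeeds.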
--
Sabrina