[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZjygvCgXFfrA4GRN@hog>
Date: Thu, 9 May 2024 12:09:00 +0200
From: Sabrina Dubroca <sd@...asysnail.net>
To: Antonio Quartulli <antonio@...nvpn.net>
Cc: netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>,
Sergey Ryazanov <ryazanov.s.a@...il.com>,
Paolo Abeni <pabeni@...hat.com>, Eric Dumazet <edumazet@...gle.com>,
Andrew Lunn <andrew@...n.ch>, Esben Haabendal <esben@...nix.com>
Subject: Re: [PATCH net-next v3 04/24] ovpn: add basic interface
creation/destruction/management routines
2024-05-09, 10:25:44 +0200, Antonio Quartulli wrote:
> On 08/05/2024 16:52, Sabrina Dubroca wrote:
> > 2024-05-06, 03:16:17 +0200, Antonio Quartulli wrote:
> > > diff --git a/drivers/net/ovpn/main.c b/drivers/net/ovpn/main.c
> > > index 33c0b004ce16..584cd7286aff 100644
> > > --- a/drivers/net/ovpn/main.c
> > > +++ b/drivers/net/ovpn/main.c
> > [...]
> > > +static void ovpn_struct_free(struct net_device *net)
> > > +{
> > > + struct ovpn_struct *ovpn = netdev_priv(net);
> > > +
> > > + rtnl_lock();
> >
> > ->priv_destructor can run from register_netdevice (already under
> > RTNL), this doesn't look right.
> >
> > > + list_del(&ovpn->dev_list);
> >
> > And if this gets called from register_netdevice, the list_add from
> > ovpn_iface_create hasn't run yet, so this will probably do strange
> > things?
>
> Argh, again I haven't considered a failure in register_netdevice and you are
> indeed right.
>
> Maybe it is better to call list_del() in the netdev notifier, upon
> NETDEV_UNREGISTER event?
I'd like to avoid splitting the clean up code over so maybe different
functions and called through different means. Keep it simple.
AFAICT the only reason you need this list is to delete your devices on
netns exit, so if we can get rid of that the list can go away.
> > > +static int ovpn_net_open(struct net_device *dev)
> > > +{
> > > + struct in_device *dev_v4 = __in_dev_get_rtnl(dev);
> > > +
> > > + if (dev_v4) {
> > > + /* disable redirects as Linux gets confused by ovpn handling
> > > + * same-LAN routing
> > > + */
> > > + IN_DEV_CONF_SET(dev_v4, SEND_REDIRECTS, false);
> > > + IPV4_DEVCONF_ALL(dev_net(dev), SEND_REDIRECTS) = false;
> >
> > Jakub, are you ok with that? This feels a bit weird to have in the
> > middle of a driver.
>
> Let me share what the problem is (copied from the email I sent to Andrew
> Lunn as he was also curious about this):
>
> The reason for requiring this setting lies in the OpenVPN server acting as
> relay point (star topology) for hosts in the same subnet.
>
> Example: given the a.b.c.0/24 IP network, you have .2 that in order to talk
> to .3 must have its traffic relayed by .1 (the server).
>
> When the kernel (at .1) sees this traffic it will send the ICMP redirects,
> because it believes that .2 should directly talk to .3 without passing
> through .1.
So only the server would need to stop sending them, not the client?
(or the client would need to ignore them)
But the kernel has no way of knowing if an ovpn device is on a client
or a server?
> Of course it makes sense in a normal network with a classic broadcast
> domain, but this is not the case in a VPN implemented as a star topology.
>
> Does it make sense?
It looks like the problem is that ovpn links are point-to-point
(instead of a broadcast LAN kind of link where redirects would make
sense), and the kernel doesn't handle it that way.
> The only way I see to fix this globally is to have an extra flag in the
> netdevice signaling this peculiarity and thus disabling ICMP redirects
> automatically.
>
> Note: wireguard has those lines too, as it probably needs to address the
> same scenario.
I've noticed a lot of similarities in some bits I've looked at, and I
hate that this is turning into another pile of duplicate code like
vxlan/geneve, bond/team, etc :(
> > [...]
> > > +void ovpn_iface_destruct(struct ovpn_struct *ovpn)
> > > +{
> > > + ASSERT_RTNL();
> > > +
> > > + netif_carrier_off(ovpn->dev);
> > > +
> > > + ovpn->registered = false;
> > > +
> > > + unregister_netdevice(ovpn->dev);
> > > + synchronize_net();
> >
> > If this gets called from the loop in ovpn_netns_pre_exit, one
> > synchronize_net per ovpn device would seem quite expensive.
>
> As per your other comment, maybe I should just remove the synchronize_net()
> entirely since it'll be the core to take care of inflight packets?
There's a synchronize_net in unregister_netdevice_many_notify, so I'd
say you can get rid of it here.
> > > static int ovpn_netdev_notifier_call(struct notifier_block *nb,
> > > unsigned long state, void *ptr)
> > > {
> > > struct net_device *dev = netdev_notifier_info_to_dev(ptr);
> > > + struct ovpn_struct *ovpn;
> > > if (!ovpn_dev_is_valid(dev))
> > > return NOTIFY_DONE;
> > > + ovpn = netdev_priv(dev);
> > > +
> > > switch (state) {
> > > case NETDEV_REGISTER:
> > > - /* add device to internal list for later destruction upon
> > > - * unregistration
> > > - */
> > > + ovpn->registered = true;
> > > break;
> > > case NETDEV_UNREGISTER:
> > > + /* twiddle thumbs on netns device moves */
> > > + if (dev->reg_state != NETREG_UNREGISTERING)
> > > + break;
> > > +
> > > /* can be delivered multiple times, so check registered flag,
> > > * then destroy the interface
> > > */
> > > + if (!ovpn->registered)
> > > + return NOTIFY_DONE;
> > > +
> > > + ovpn_iface_destruct(ovpn);
> >
> > Maybe I'm misunderstanding this code. Why do you want to manually
> > destroy a device that is already going away?
>
> We need to perform some internal cleanup (i.e. release all peers).
> I don't see how this can happen automatically, no?
That's what ->priv_destructor does, and it will be called ultimately
by the unregister_netdevice call you have in ovpn_iface_destruct (in
netdev_run_todo). Anyway, this UNREGISTER event is probably generated
by unregister_netdevice_many_notify (basically a previous
unregister_netdevice() call), so I don't know why you want to call
unregister_netdevice again on the same device.
> > > @@ -62,6 +210,24 @@ static struct notifier_block ovpn_netdev_notifier = {
> > > .notifier_call = ovpn_netdev_notifier_call,
> > > };
> > > +static void ovpn_netns_pre_exit(struct net *net)
> > > +{
> > > + struct ovpn_struct *ovpn;
> > > +
> > > + rtnl_lock();
> > > + list_for_each_entry(ovpn, &dev_list, dev_list) {
> > > + if (dev_net(ovpn->dev) != net)
> > > + continue;
> > > +
> > > + ovpn_iface_destruct(ovpn);
> >
> > Is this needed? On netns destruction all devices within the ns will be
> > destroyed by the networking core.
>
> Before implementing ovpn_netns_pre_exit() this way, upon namespace deletion
> the ovpn interface was being moved to the global namespace.
Crap it's only the devices with ->rtnl_link_ops that get killed by the
core. Because you create your devices via genl (which I'm not a fan
of, even if it's a bit nicer for userspace having a single netlink api
to deal with), default_device_exit_batch/default_device_exit_net think
ovpn devices are real NICs and move them back to init_net instead of
destroying them.
Maybe we can extend the condition in default_device_exit_net with a
new flag so that ovpn devices get destroyed by the core, even without
rtnl_link_ops?
--
Sabrina
Powered by blists - more mailing lists