Message-ID: <b6a6c29b-ad78-4d6f-be1a-93615f27c956@openvpn.net>
Date: Thu, 9 May 2024 10:25:44 +0200
From: Antonio Quartulli <antonio@...nvpn.net>
To: Sabrina Dubroca <sd@...asysnail.net>
Cc: netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>,
Sergey Ryazanov <ryazanov.s.a@...il.com>, Paolo Abeni <pabeni@...hat.com>,
Eric Dumazet <edumazet@...gle.com>, Andrew Lunn <andrew@...n.ch>,
Esben Haabendal <esben@...nix.com>
Subject: Re: [PATCH net-next v3 04/24] ovpn: add basic interface
creation/destruction/management routines
On 08/05/2024 16:52, Sabrina Dubroca wrote:
> 2024-05-06, 03:16:17 +0200, Antonio Quartulli wrote:
>> diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
>> index ad3813419c33..338e99dfe886 100644
>> --- a/drivers/net/ovpn/io.c
>> +++ b/drivers/net/ovpn/io.c
>> @@ -11,6 +11,26 @@
>> #include <linux/skbuff.h>
>>
>> #include "io.h"
>> +#include "ovpnstruct.h"
>> +#include "netlink.h"
>> +
>> +int ovpn_struct_init(struct net_device *dev)
>
> nit: Should this be in main.c? It's only used there, and I think it
> would make more sense to drop it next to ovpn_struct_free.
Yeah, that makes sense. I will move it.
>
>> +{
>> + struct ovpn_struct *ovpn = netdev_priv(dev);
>> + int err;
>> +
>
> [...]
>> diff --git a/drivers/net/ovpn/main.c b/drivers/net/ovpn/main.c
>> index 33c0b004ce16..584cd7286aff 100644
>> --- a/drivers/net/ovpn/main.c
>> +++ b/drivers/net/ovpn/main.c
> [...]
>> +static void ovpn_struct_free(struct net_device *net)
>> +{
>> + struct ovpn_struct *ovpn = netdev_priv(net);
>> +
>> + rtnl_lock();
>
> ->priv_destructor can run from register_netdevice (already under
> RTNL), this doesn't look right.
>
>> + list_del(&ovpn->dev_list);
>
> And if this gets called from register_netdevice, the list_add from
> ovpn_iface_create hasn't run yet, so this will probably do strange
> things?
Argh, once again I did not consider a failure in register_netdevice();
you are indeed right.
Maybe it is better to call list_del() in the netdev notifier, upon the
NETDEV_UNREGISTER event?
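To make the idea concrete, here is a small userspace model of the ordering
problem (not kernel code, and not the actual ovpn implementation). All names
in it (model_dev, model_iface_create, model_unregister_event, ...) are made up
for illustration. It assumes the destructor only frees private resources and
that list removal happens solely in the handler that models NETDEV_UNREGISTER,
which is only reached for devices that registered successfully.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these names are kernel APIs. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }

static void node_add(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void node_del(struct node *n)
{
	/* a zeroed, never-linked node would fault right here */
	assert(n->prev && n->next);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

static size_t list_len(const struct node *head)
{
	size_t len = 0;

	for (const struct node *n = head->next; n != head; n = n->next)
		len++;
	return len;
}

struct model_dev {
	struct node dev_list;
	bool registered;
	bool stats_freed;
};

/* models ->priv_destructor: frees resources, never touches the list */
static void model_destructor(struct model_dev *d) { d->stats_freed = true; }

/* models the NETDEV_UNREGISTER handler: the only place that unlinks */
static void model_unregister_event(struct model_dev *d)
{
	if (!d->registered)
		return;
	d->registered = false;
	node_del(&d->dev_list);
}

/* models creation: the node is linked only after registration succeeded */
static int model_iface_create(struct model_dev *d, struct node *head, bool fail)
{
	node_init(&d->dev_list);
	d->registered = false;
	d->stats_freed = false;
	if (fail) {
		/* failed registration runs the destructor before any list_add */
		model_destructor(d);
		return -1;
	}
	d->registered = true;
	node_add(&d->dev_list, head);
	return 0;
}

/* models normal teardown: unregister event first, destructor afterwards */
static void model_iface_destroy(struct model_dev *d)
{
	model_unregister_event(d);
	model_destructor(d);
}
```

In this model a failed creation never reaches the unlink path, so the
destructor cannot operate on a node that was never added to the list.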
>
>> + rtnl_unlock();
>> +
>> + free_percpu(net->tstats);
>> +}
>> +
>> +static int ovpn_net_open(struct net_device *dev)
>> +{
>> + struct in_device *dev_v4 = __in_dev_get_rtnl(dev);
>> +
>> + if (dev_v4) {
>> + /* disable redirects as Linux gets confused by ovpn handling
>> + * same-LAN routing
>> + */
>> + IN_DEV_CONF_SET(dev_v4, SEND_REDIRECTS, false);
>> + IPV4_DEVCONF_ALL(dev_net(dev), SEND_REDIRECTS) = false;
>
> Jakub, are you ok with that? This feels a bit weird to have in the
> middle of a driver.
Let me share what the problem is (copied from the email I sent to Andrew
Lunn as he was also curious about this):
The reason for requiring this setting is that the OpenVPN server acts
as a relay point (star topology) for hosts in the same subnet.
Example: in the a.b.c.0/24 IP network, .2 can talk to .3 only by having
its traffic relayed by .1 (the server).
When the kernel (at .1) sees this traffic it sends ICMP redirects,
because it believes that .2 should talk to .3 directly, without
passing through .1.
Redirects make sense in a normal network with a classic broadcast
domain, but not in a VPN implemented as a star topology.
Does that make sense?
The only way I see to fix this globally is to have an extra flag in the
netdevice signaling this peculiarity and thus disabling ICMP redirects
automatically.
Note: WireGuard has those lines too, as it probably needs to address the
same scenario.
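As a rough illustration of the scenario above, here is a simplified userspace
model of the redirect decision. It is not the kernel's routing code: the
struct, the function names and the 192.0.2.0/24 example addresses are all
assumptions made for this sketch. The model says a redirect is sent when the
packet leaves through the interface it arrived on and the sender is in the
same subnet as the next hop, unless redirects are disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Build an IPv4 address in host byte order (illustrative helper). */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Simplified model of a forwarded packet; not a kernel structure. */
struct model_pkt {
	int in_ifindex;   /* interface the packet arrived on */
	int out_ifindex;  /* interface the packet will leave through */
	uint32_t saddr;   /* sender address */
	uint32_t nexthop; /* address the packet is forwarded to */
};

static bool same_subnet(uint32_t a, uint32_t b, uint32_t mask)
{
	return (a & mask) == (b & mask);
}

/* Returns true when this model would emit an ICMP redirect to the sender. */
static bool model_would_redirect(const struct model_pkt *p, uint32_t mask,
				 bool send_redirects)
{
	assert(p != NULL);

	if (!send_redirects)
		return false;
	/* traffic relayed by the server enters and leaves the same device */
	if (p->in_ifindex != p->out_ifindex)
		return false;
	/* sender looks on-link for the next hop: "talk to it directly" */
	return same_subnet(p->saddr, p->nexthop, mask);
}
```

With .2 sending to .3 through a single tunnel interface, the model reports a
redirect unless send_redirects is off, which is the behaviour the two lines in
ovpn_net_open() are meant to suppress.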
>
>> + }
>> +
>> + netif_tx_start_all_queues(dev);
>> + return 0;
>> +}
>
> [...]
>> +void ovpn_iface_destruct(struct ovpn_struct *ovpn)
>> +{
>> + ASSERT_RTNL();
>> +
>> + netif_carrier_off(ovpn->dev);
>> +
>> + ovpn->registered = false;
>> +
>> + unregister_netdevice(ovpn->dev);
>> + synchronize_net();
>
> If this gets called from the loop in ovpn_netns_pre_exit, one
> synchronize_net per ovpn device would seem quite expensive.
As per your other comment, maybe I should just remove the
synchronize_net() call entirely, since the core will take care of
in-flight packets?
>
>> +}
>> +
>> static int ovpn_netdev_notifier_call(struct notifier_block *nb,
>> unsigned long state, void *ptr)
>> {
>> struct net_device *dev = netdev_notifier_info_to_dev(ptr);
>> + struct ovpn_struct *ovpn;
>>
>> if (!ovpn_dev_is_valid(dev))
>> return NOTIFY_DONE;
>>
>> + ovpn = netdev_priv(dev);
>> +
>> switch (state) {
>> case NETDEV_REGISTER:
>> - /* add device to internal list for later destruction upon
>> - * unregistration
>> - */
>> + ovpn->registered = true;
>> break;
>> case NETDEV_UNREGISTER:
>> + /* twiddle thumbs on netns device moves */
>> + if (dev->reg_state != NETREG_UNREGISTERING)
>> + break;
>> +
>> /* can be delivered multiple times, so check registered flag,
>> * then destroy the interface
>> */
>> + if (!ovpn->registered)
>> + return NOTIFY_DONE;
>> +
>> + ovpn_iface_destruct(ovpn);
>
> Maybe I'm misunderstanding this code. Why do you want to manually
> destroy a device that is already going away?
We need to perform some internal cleanup (i.e. release all peers).
I don't see how this could happen automatically, or am I missing something?
>
>> break;
>> case NETDEV_POST_INIT:
>> case NETDEV_GOING_DOWN:
>> case NETDEV_DOWN:
>> case NETDEV_UP:
>> case NETDEV_PRE_UP:
>> + break;
>> default:
>> return NOTIFY_DONE;
>> }
>> @@ -62,6 +210,24 @@ static struct notifier_block ovpn_netdev_notifier = {
>> .notifier_call = ovpn_netdev_notifier_call,
>> };
>>
>> +static void ovpn_netns_pre_exit(struct net *net)
>> +{
>> + struct ovpn_struct *ovpn;
>> +
>> + rtnl_lock();
>> + list_for_each_entry(ovpn, &dev_list, dev_list) {
>> + if (dev_net(ovpn->dev) != net)
>> + continue;
>> +
>> + ovpn_iface_destruct(ovpn);
>
> Is this needed? On netns destruction all devices within the ns will be
> destroyed by the networking core.
Before I implemented ovpn_netns_pre_exit() this way, the ovpn interface
was being moved to the global namespace upon namespace deletion.
Hence I decided to take care of its destruction manually.
Isn't this expected?
--
Antonio Quartulli
OpenVPN Inc.