netdev - Re: [PATCH net-next v11 05/23] ovpn: keep carrier always on

Message-ID: <c62208a4-5396-4116-add1-4ffbc254a09d@gmail.com>
Date: Mon, 25 Nov 2024 04:26:18 +0200
From: Sergey Ryazanov <ryazanov.s.a@...il.com>
To: Antonio Quartulli <antonio@...nvpn.net>
Cc: Andrew Lunn <andrew@...n.ch>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 Donald Hunter <donald.hunter@...il.com>, Shuah Khan <shuah@...nel.org>,
 sd@...asysnail.net, netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-kselftest@...r.kernel.org
Subject: Re: [PATCH net-next v11 05/23] ovpn: keep carrier always on

On 24.11.2024 00:52, Antonio Quartulli wrote:
> On 23/11/2024 23:25, Sergey Ryazanov wrote:
>> On 21.11.2024 23:17, Antonio Quartulli wrote:
>>> On 20/11/2024 23:56, Sergey Ryazanov wrote:
>>>> On 15.11.2024 16:13, Antonio Quartulli wrote:
>>>>> On 09/11/2024 02:11, Sergey Ryazanov wrote:
>>>>>> On 29.10.2024 12:47, Antonio Quartulli wrote:
>>>>>>> An ovpn interface will keep carrier always on and let the user
>>>>>>> decide when an interface should be considered disconnected.
>>>>>>>
>>>>>>> This way, even if an ovpn interface is not connected to any peer,
>>>>>>> it can still retain all IPs and routes and thus prevent any data
>>>>>>> leak.
>>>>>>>
>>>>>>> Signed-off-by: Antonio Quartulli <antonio@...nvpn.net>
>>>>>>> Reviewed-by: Andrew Lunn <andrew@...n.ch>
>>>>>>> ---
>>>>>>>   drivers/net/ovpn/main.c | 7 +++++++
>>>>>>>   1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/net/ovpn/main.c b/drivers/net/ovpn/main.c
>>>>>>> index eead7677b8239eb3c48bb26ca95492d88512b8d4..eaa83a8662e4ac2c758201008268f9633643c0b6 100644
>>>>>>> --- a/drivers/net/ovpn/main.c
>>>>>>> +++ b/drivers/net/ovpn/main.c
>>>>>>> @@ -31,6 +31,13 @@ static void ovpn_struct_free(struct net_device *net)
>>>>>>>   static int ovpn_net_open(struct net_device *dev)
>>>>>>>   {
>>>>>>> +    /* ovpn keeps the carrier always on to avoid losing IP or route
>>>>>>> +     * configuration upon disconnection. This way it can prevent leaks
>>>>>>> +     * of traffic outside of the VPN tunnel.
>>>>>>> +     * The user may override this behaviour by tearing down the interface
>>>>>>> +     * manually.
>>>>>>> +     */
>>>>>>> +    netif_carrier_on(dev);
>>>>>>
>>>>>> If a user cares about traffic leaking, they can create a
>>>>>> blackhole route with a huge metric:
>>>>>>
>>>>>> # ip route add blackhole default metric 10000
>>>>>>
>>>>>> Why should the network interface implicitly provide this
>>>>>> functionality? And on the other hand, how can a routing daemon learn
>>>>>> of a topology change without an indication from the interface?
>>>>>
>>>>> This was discussed loooong ago with Andrew. Here is my last response:
>>>>>
>>>>> https://lore.kernel.org/all/d896bbd8-2709-4834-a637-f982fc51fc57@...nvpn.net/
>>>>
>>>> Thank you for sharing the link to the beginning of the conversation.
>>>> So far we have three topics regarding the operational state
>>>> indication:
>>>> 1. possible absence of a concept of running state,
>>>> 2. influence on routing protocol implementations,
>>>> 3. traffic leaking.
>>>>
>>>> As for the concept of the running state, it should exist for
>>>> tunneling protocols with state tracking. In this specific case, we
>>>> can consider the interface running when it has a configured peer with
>>>> keys. The protocol even has a nice feature for connection monitoring:
>>>> keepalive.
>>>
>>> What about a device in MP mode? It doesn't make sense to turn the 
>>> carrier off when the MP node has no peers connected.
>>> At the same time I don't like having P2P and MP devices behaving 
>>> differently in this regard.
>>
>> MP with a single network interface is an endless headache. Indeed. On
>> the other hand, penalizing P2P users just because the protocol supports
>> MP doesn't look like a solution either.
> 
> On the upper side, with "iroutes" implemented using the system routing 
> table, routing protocols will be able to detect new routes only when the 
> related client has connected. (The same for the disconnection)
> 
> But this is a bit orthogonal compared to the oper state.

The patch has nothing in common with route configuration. The main
concern is the forcing of the running state indication and, more
specifically, the justification given for it.

>>> Therefore keeping the carrier on seemed the most logical way forward 
>>> (at least for now - we can still come back to this once we have 
>>> something smarter to implement).
>>
>> It was shown above how to distinguish between running and non-running 
>> cases.
>>
>> If the author doesn't want to implement operational state indication
>> now, then I'm OK with it. Not a big deal now. I just don't like the
>> idea of promoting the abuse of the running state indicator. Please see
>> below.
>>
>>>> Routing protocols, on the one hand, could benefit from the operational
>>>> state indication. On the other hand, the hello/hold timer values
>>>> mentioned in the documentation are comparable with default routing
>>>> protocol timers. So the actual improvement is debatable.
>>>>
>>>> Regarding traffic leaking, as I mentioned before, a blackhole
>>>> route or a firewall rule works better than implicit blackholing with
>>>> a non-running interface.
>>>>
>>>> Long story short, I agree that we might not need a real operational
>>>> state indication now. Still, protection against traffic leaking is not
>>>> a good enough justification.
>>>
>>> Well, it's the so called "persistent interface" concept in VPNs: 
>>> leave everything as is, even if the connection is lost.
>>
>> It's called routing framework abuse. The IP router will choose the
>> route and the egress interface not because this route is a good option
>> to deliver a packet, but because someone tricked it.
> 
> This is what the user wants.

I would be happy to see a study of users' preferences.

> OpenVPN (userspace) will tear down the P2P interface upon disconnection, 
> assuming the --persist-tun option was not specified by the user.
> 
> So the interface is gone in any case.
> 
> By keeping the netcarrier on we are just ensuring that, if the user 
> wanted persist-tun, the iface is not actually making decisions on its own.

Regarding a decision on its own: an Ethernet interface goes to the
not-running state upon loss of carrier from a switch. That can hardly
be considered a decision of the interface. It is an indication of a fact.

Similarly, the beeping of a UPS is not its decision to make the user's
life miserable; it is an indication of a power line failure. I hope we
at least both agree that a UPS should indicate a line failure.

Back to the 'persist-tun' option. I checked the openvpn(8) man page. It
gives reasonable hints to use this option to avoid negative outcomes
on an internal openvpn process restart, e.g. in the case of privilege
dropping. It serves the same purpose as 'persist-key'. And there is not
a word regarding traffic leaking.

If somebody has decided that this option has a funny side effect that
allows cutting corners, then I cannot say anything but sorry.

> With a tun interface this can be done, now you want to basically drop
> this feature that existed for a long time and break existing setups.

Amicus Plato, sed magis amica veritas

Yes, I don't want to see this interface misbehaviour advertised as a
security feature. I hope the previous email gives a detailed explanation
of why.

If it's going to break existing setups, then end users can be supported
with a changelog notice explaining how to properly address the risk of
traffic leaking.

>> In some circumstances, e.g. an Android app, it could be the only way to
>> prevent traffic leaking. But these special circumstances do not make
>> the solution generic and eligible for inclusion into the mainline code.
> 
> Why not? We are not changing the general rule, but just defining a 
> specific behaviour for a specific driver.

Yeah. This patch is not changing the general rule. The patch breaks it,
and the comment in the code takes pride in that. It looks like the old
joke that a documented bug becomes a feature.

From a system administrator's or a firmware developer's perspective, the
proposed behaviour will look like an inconsistency compared to other
interface types. And this inconsistency has to be addressed with
special configuration or, in the worst case, a dedicated script. I
cannot see a justified reason to make their lives harder.

> For example, I don't think a tun interface goes down when there is no 
> socket attached to it, still packets are just going to be blackhole'd in 
> that case. No?

Nope. A tun interface will indeed go into the non-running state on the
detach event. Moreover, the tun module supports changing the
running/non-running indication upon a command from userspace. But not
every userspace application feels a desire to implement it.
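
The userspace command referred to here is presumably the TUNSETCARRIER
ioctl. A minimal sketch of its use follows; the wrapper name
tun_set_carrier() is hypothetical, and tun_fd is assumed to be a
descriptor on /dev/net/tun that was already attached with TUNSETIFF.

```c
#include <linux/if_tun.h>
#include <sys/ioctl.h>

/* Older UAPI headers may lack the definition. */
#ifndef TUNSETCARRIER
#define TUNSETCARRIER _IOW('T', 226, int)
#endif

/* Hypothetical wrapper: ask the tun driver to switch the carrier of
 * the attached interface on (non-zero) or off (zero).
 * Returns 0 on success, -1 with errno set on failure.
 */
static int tun_set_carrier(int tun_fd, int carrier_on)
{
	int carrier = carrier_on ? 1 : 0;

	return ioctl(tun_fd, TUNSETCARRIER, &carrier);
}
```

A daemon could call tun_set_carrier(fd, 0) when its peer times out and
tun_set_carrier(fd, 1) once the tunnel is re-established.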

>>> I know it can be implemented in many other different ways..but I 
>>> don't see a real problem with keeping this way.
>>
>> At least routing protocols and network monitoring software will not be 
>> happy to see a dead interface pretending that it's still running. 
> 
> They won't know that the interface is disconnected, they will possibly 
> just see traffic being dropped.

Packet loss detection is quite a complex operation. So yes, they do
monitor the interface operational state to warn the operator as soon as
possible and, in the case of routing protocols, to take automatic
action. Some sophisticated monitoring systems are even capable of
generating events like 'link unstable' with higher severity if they see
the interface operational state flapping within a short period of time.

So yeah, for these kinds of systems, proper operational state indication
is essential.
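
As a rough illustration of the 'link unstable' logic, a monitor could
count operational state transitions inside a sliding time window. The
sketch below is an assumption about how such a check might look, not
the algorithm of any particular monitoring system; the function name,
the window and the threshold are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: ts holds the timestamps (in seconds, sorted in
 * ascending order) of n operational state transitions. Returns true
 * if at least threshold of them fall within any window_s-second span.
 * A threshold of 0 disables the check.
 */
static bool link_is_flapping(const long *ts, size_t n, long window_s,
			     size_t threshold)
{
	size_t i;

	if (threshold == 0)
		return false;

	/* Compare the first and last of each run of threshold events. */
	for (i = 0; i + threshold <= n; i++) {
		if (ts[i + threshold - 1] - ts[i] <= window_s)
			return true;
	}

	return false;
}
```

Such a check only has input to work on if the interface reports its
state changes in the first place.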

>> Generally speaking, saying that the interface is running when the
>> module knows for sure that a packet cannot be delivered misleads the user.
> 
> Or a feature, wanted by the user.
> 
>>> A blackhole/firewall can still be added if the user prefers (and not 
>>> use the persistent interface).
>>
>> The solution with false indication is not as reliable as it might
>> look. Interface shutdown, inability of a user-space application to
>> start, user-space application crash, user-space application restart:
>> each of them will void the trick. Ergo, a blackhole/firewall must be
>> employed by a security-conscious user. This makes the proposed feature
>> odd.
> 
> Yeah, this is what other VPN clients call "kill switch".
> Persist-tun is just one piece of the puzzle, yet important.
> 
>>
>> To summarize, I'm OK if this change is merged with a comment like
>> "For future study" or "To be done" or "To be implemented". But a
>> comment like "to prevent traffic leaking" or any other comment
>> implying a "breakthrough security feature" will get a big NACK from
>> my side.
> 
> What if the comment redirects the user to --persist-tun option in order 
> to clarify the context and the wanted behaviour?
> 
> Would that help?

Nope. As was mentioned above, there is no indication that
'persist-tun' is a 'security' feature even in the current openvpn
documentation.

If the openvpn developers want to keep the implementation bug-to-bug
compatible, then feel free to configure the blackhole route on behalf of
the end user by means of the userspace daemon. Nobody will mind.
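
A minimal sketch of what that could look like in a daemon, assuming
IPv4, the main routing table and the metric from the 'ip route'
example quoted earlier. The struct and function names are hypothetical;
the code only builds the rtnetlink request, roughly equivalent to
"ip route add blackhole default metric <metric>", and leaves sending it
to the caller.

```c
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout: netlink header, route message and room
 * for a single 32-bit attribute (the route metric).
 */
struct blackhole_req {
	struct nlmsghdr nh;
	struct rtmsg rt;
	char attrs[RTA_SPACE(sizeof(uint32_t))];
};

/* Fill req with an RTM_NEWROUTE request for a blackhole default route. */
static void build_blackhole_default(struct blackhole_req *req,
				    uint32_t metric)
{
	struct rtattr *rta;

	memset(req, 0, sizeof(*req));
	req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
	req->nh.nlmsg_type = RTM_NEWROUTE;
	req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL |
			      NLM_F_ACK;

	req->rt.rtm_family = AF_INET;
	req->rt.rtm_dst_len = 0;		/* 0.0.0.0/0, i.e. default */
	req->rt.rtm_table = RT_TABLE_MAIN;
	req->rt.rtm_protocol = RTPROT_STATIC;	/* assumption: daemon's choice */
	req->rt.rtm_scope = RT_SCOPE_UNIVERSE;
	req->rt.rtm_type = RTN_BLACKHOLE;

	/* Append RTA_PRIORITY, which carries the metric. */
	rta = (struct rtattr *)((char *)req + NLMSG_ALIGN(req->nh.nlmsg_len));
	rta->rta_type = RTA_PRIORITY;
	rta->rta_len = RTA_LENGTH(sizeof(metric));
	memcpy(RTA_DATA(rta), &metric, sizeof(metric));
	req->nh.nlmsg_len = NLMSG_ALIGN(req->nh.nlmsg_len) +
			    RTA_ALIGN(rta->rta_len);
}
```

The filled request would then be written to an AF_NETLINK socket
opened with NETLINK_ROUTE, which requires CAP_NET_ADMIN.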

--
Sergey