linux-kernel - Re: [PATCH net-next v11 05/23] ovpn: keep carrier always on

Message-ID: <debdfbda-36f8-4c83-bb54-3b48af77e7bd@gmail.com>
Date: Mon, 25 Nov 2024 23:32:24 +0200
From: Sergey Ryazanov <ryazanov.s.a@...il.com>
To: Antonio Quartulli <antonio@...nvpn.net>
Cc: Andrew Lunn <andrew@...n.ch>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 Donald Hunter <donald.hunter@...il.com>, Shuah Khan <shuah@...nel.org>,
 sd@...asysnail.net, netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-kselftest@...r.kernel.org
Subject: Re: [PATCH net-next v11 05/23] ovpn: keep carrier always on

On 25.11.2024 15:07, Antonio Quartulli wrote:
> On 25/11/2024 03:26, Sergey Ryazanov wrote:
>>> OpenVPN (userspace) will tear down the P2P interface upon 
>>> disconnection, assuming the --persist-tun option was not specified by 
>>> the user.
>>>
>>> So the interface is gone in any case.
>>>
>>> By keeping the netcarrier on we are just ensuring that, if the user 
>>> wanted persist-tun, the iface is not actually making decisions on its 
>>> own.
>>
>> Regarding a decision on its own: an Ethernet interface goes to the
>> not-running state upon loss of carrier from a switch. That can hardly
>> be considered a decision of the interface. It's an indication of a fact.
>>
>> Similarly, the beeping of a UPS is not its decision to make the user's
>> life miserable; it's an indication of a power line failure. I hope we
>> at least both agree that a UPS should indicate the line failure.
> 
> The answer is always "it depends".
> 
>> Back to the 'persist-tun' option. I checked the openvpn(8) man page.
>> It gives reasonable hints to use this option to avoid negative
>> outcomes on an internal openvpn process restart, e.g. in case of
>> privilege dropping. It serves the same purpose as 'persist-key'. And
>> there is no word regarding traffic leaking.
> 
> FTR, here is the text in the manpage:
> 
>         --persist-tun
>                Don't close and reopen TUN/TAP device or run up/down 
> scripts across SIGUSR1 or --ping-restart restarts.
> 
>                SIGUSR1 is a restart signal similar to SIGHUP, but which 
> offers finer-grained control over reset options.
> 
> 
> SIGUSR1 is a session reconnection, not a process restart.
> The manpage just indicates what happens at the low level when this 
> option is provided.

Still no mention of traffic leak prevention, is there?

> The next question is: what is this useful for? Many things, among those 
> there is the fact the interface will retain its configuration (i.e. IPs, 
> routes, etc).

This is unrelated to correct operational state indication. Addresses
and routes are not reset when an interface goes to the non-running state.

>> If somebody has decided that this option has a funny side effect
>> that allows cutting corners, then I cannot say anything but sorry.
> 
> Well, OpenVPN is more than 20 years old.

More than 20 years of misguiding users has been duly noted :)

Should I mention that RFC 1066, which contains the ifOperStatus definition,
was issued 12 years before OpenVPN? And then it was updated with multiple
clarifications.
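
For reference, a minimal sketch of the ifOperStatus values as a Python enum. The first three values are the ones RFC 1066 defines; the remaining four are as listed in the later interfaces MIB (RFC 2863). The class name and the helper `ready_to_pass_packets` are hypothetical, for illustration only.

```python
from enum import IntEnum


class IfOperStatus(IntEnum):
    # Values defined by RFC 1066.
    up = 1        # ready to pass packets
    down = 2
    testing = 3   # in some test mode
    # Values present in the later interfaces MIB (RFC 2863).
    unknown = 4
    dormant = 5
    notPresent = 6
    lowerLayerDown = 7


def ready_to_pass_packets(status: IfOperStatus) -> bool:
    """Hypothetical helper: only 'up' means the interface can pass packets."""
    return status is IfOperStatus.up


if __name__ == "__main__":
    for status in IfOperStatus:
        print(status.value, status.name, ready_to_pass_packets(status))
```

The point of the sketch is that 'up' has always meant "ready to pass packets", not "the device exists" or "the administrator wants it up".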

> If a given API allows a specific user behaviour and had done so for 
> those many years, changing it is still a user breakage. Not much we can do.
> 
>>> With a tun interface this can be done, now you want to basically drop
>>> this feature that existed for a long time and break existing setups.
>>
>> Amicus Plato, sed magis amica veritas
>>
>> Yes, I don't want to see this interface misbehaviour advertised as a 
>> security feature. I hope the previous email gives a detailed 
>> explanation why.
> 
> Let's forget about the traffic leak mention and the "security feature". 
> That comment was probably written in the middle of the night and I agree 
> it gives a false sense of what is happening.
> 
>> If it's going to break existing setup, then end-users can be supported 
>> with a changelog notice explaining how to properly address the risk of 
>> the traffic leaking.
> 
> Nope, we can't just break existing user setups.
> 
>>>> At some circumstance, e.g. Android app, it could be the only way to 
>>>> prevent traffic leaking. But these special circumstances do not make 
>>>> solution generic and eligible for inclusion into the mainline code.
>>>
>>> Why not? We are not changing the general rule, but just defining a 
>>> specific behaviour for a specific driver.
>>
>> Yeah. This patch is not changing the general rule. The patch breaks it
>> and the comment in the code takes pride in it. It looks like the old
>> joke that a documented bug becomes a feature.
> 
> Like I said above, let's make the comment meaningful for the expected 
> goal: implement persist-tun while leaving userspace the chance to decide 
> what to do.
> 
>>
>> From a system administrator's or a firmware developer's perspective, the
>> proposed behaviour will look like an inconsistency compared to other
>> interface types. And this inconsistency has to be addressed with
>> special configuration or, in the worst case, a dedicated script. And I
>> cannot see a justified reason to make their life harder.
> 
> You can configure openvpn to bring the interface down when the 
> connection is lost. Why do you say it requires extra scripting and such?

Being administratively down and being operationally down are different 
states.
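
To make the distinction concrete, here is a minimal sketch in Python. It assumes the flags word is whatever SIOCGIFFLAGS or an RTM_NEWLINK message reports for the device; the helper name `link_state` is hypothetical and the sample flag words are illustrative, not captured from a real interface. IFF_UP and IFF_RUNNING carry the values from <linux/if.h>.

```python
# Flag values as defined in <linux/if.h>.
IFF_UP = 0x1        # administrative state, set by the operator or userspace
IFF_RUNNING = 0x40  # operational state, reported by the kernel


def link_state(flags: int) -> str:
    """Classify an interface flags word (hypothetical helper)."""
    if not flags & IFF_UP:
        return "administratively down"
    if not flags & IFF_RUNNING:
        return "administratively up, operationally down"
    return "administratively up, operationally up"


if __name__ == "__main__":
    # Illustrative flag words only.
    for flags in (0x1002, 0x1003, 0x1043):
        print(hex(flags), "->", link_state(flags))
```

Bringing the interface down from openvpn only ever produces the first state. A peer that is gone while the interface stays configured is the second one, and that is the state a monitoring tool or a routing daemon looks for.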

>>> For example, I don't think a tun interface goes down when there is no 
>>> socket attached to it, still packets are just going to be blackhole'd 
>>> in that case. No?
>>
>> Nope. A tun interface will indeed go into the non-running state on the
>> detach event. Moreover, the tun module supports changing the
>> running/non-running indication upon a command from userspace. But not
>> every userspace application feels a desire to implement it.
> 
> With 'ovpn' we basically want a similar effect: let userspace decide 
> what to do depending on the configuration.
> 
>>
>>>>> I know it can be implemented in many other different ways..but I 
>>>>> don't see a real problem with keeping this way.
>>>>
>>>> At least routing protocols and network monitoring software will not 
>>>> be happy to see a dead interface pretending that it's still running. 
>>>
>>> They won't know that the interface is disconnected, they will 
>>> possibly just see traffic being dropped.
>>
>> Packet loss detection is quite a complex operation. So yes, they are
>> indeed monitoring the interface operational state to warn the operator
>> as soon as possible and, if we are talking about routing protocols, to
>> take some automatic actions. Some sophisticated monitoring systems are
>> even capable of generating events like 'link unstable' with higher
>> severity if they see the interface operational state flapping within a
>> short period of time.
>>
>> So yeah, for these kinds of systems, proper operational state 
>> indication is essential.
> 
> Again, if the user has not explicitly allowed the persistent behaviour, 
> the interface will be brought down when a disconnection happens.
> But if the user/administrator *wants* to avoid that, he needs a
> chance to do that.
>
> Otherwise people who need this behaviour will just have to stick to
> using tun and the full userspace implementation.
> 
>>
>>>> Generally speaking, saying that an interface is running when the module
>>>> knows for sure that a packet cannot be delivered is misleading the user.
>>>
>>> Or a feature, wanted by the user.
>>>
>>>>> A blackhole/firewall can still be added if the user prefers (and 
>>>>> not use the persistent interface).
>>>>
>>>> The solution with false indication is not as reliable as it might
>>>> look. Interface shutdown, inability of a user-space application to
>>>> start, user-space application crash, user-space application restart:
>>>> each of them will void the trick. Ergo, a blackhole/firewall must be
>>>> employed by a security-concerned user, which makes the proposed
>>>> feature odd.
>>>
>>> Yeah, this is what other VPN clients call "kill switch".
>>> Persist-tun is just one piece of the puzzle, yet important.
>>>
>>>>
>>>> To summarize, I'm OK if this change is merged with a comment
>>>> like "For future study" or "To be done" or "To be implemented". But
>>>> a comment like "to prevent traffic leaking" or any other comment
>>>> implying a "breakthrough security feature" will get a big NACK from
>>>> my side.
>>>
>>> What if the comment redirects the user to --persist-tun option in 
>>> order to clarify the context and the wanted behaviour?
>>>
>>> Would that help?
>>
>> Nope. As mentioned above, there is no indication that 'persist-tun'
>> is a 'security' feature even in the current openvpn documentation.
>>
> 
> Like I mentioned above, I agree we should get rid of that sentence.
> The security feature must be implemented by means of extra tools, just 
> the interface staying up is not enough.
> 
>> If the openvpn developers want to keep implementation bug-to-bug 
>> compatible, then feel free to configure the blackhole route on behalf 
>> of end-user by means of the userspace daemon. Nobody will mind.
> 
> bug-to-bug compatible? What do you mean?

http://www.jargon.net/jargonfile/b/bug-compatible.html

With the difference that the local operational state indication does not
break compatibility between hosts.

> Having userspace configure a blackhole route is something that can be 
> considered by whoever decides to implement the "kill switch" feature.
> 
> OpenVPN does not. It just implements --persist-tun.
> 
> So all in all, the conclusion is that in this case it's up to userspace to
> decide when the interface should go up and down, depending on the
> configuration. I'd like to keep it as it is to avoid having the ovpn
> interface make decisions on its own.
> 
> I can spell this out in the comment (I think it definitely makes sense), 
> to clarify that the netcarrier is expected to be driven by userspace 
> (where the control plane is) rather than having the device make 
> decisions without having the full picture.
> 
> What do you think?

It wasn't suggested to destroy the interface when it becomes
non-operational. I apologize if something I wrote earlier sounded like
that. The existence of the interface is not in question. It's going to
be solidly persistent.

Back to the proposed rephrasing. If the 'full picture' means forcing the
running state indication even when the netdev is not capable of delivering
packets, then it looks like an attempt to hide the control knob of the
misleading feature somewhere else.

And since the concept of on-purpose false indication is still there, many
words about the control plane and the full picture do not sound good
either.

--
Sergey