lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z89ofEgYznYavrx8@krikkit>
Date: Mon, 10 Mar 2025 23:32:28 +0100
From: Sabrina Dubroca <sd@...asysnail.net>
To: Antonio Quartulli <antonio@...nvpn.net>
Cc: netdev@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Donald Hunter <donald.hunter@...il.com>,
	Shuah Khan <shuah@...nel.org>, ryazanov.s.a@...il.com,
	Andrew Lunn <andrew+netdev@...n.ch>,
	Simon Horman <horms@...nel.org>, linux-kernel@...r.kernel.org,
	linux-kselftest@...r.kernel.org, Xiao Liang <shaw.leon@...il.com>
Subject: Re: [PATCH v21 18/24] ovpn: add support for peer floating

Note: as I tried to say at the end of my previous reply, feel free to
ignore this or save it for later. dst_cache_reset on float and change
of remote via netlink peer_modify is the important thing, and I'm not
sure the "local address change on float" can or can't happen.


2025-03-10, 13:57:09 +0100, Antonio Quartulli wrote:
> On 07/03/2025 11:12, Sabrina Dubroca wrote:
> > 2025-03-06, 11:02:50 +0100, Antonio Quartulli wrote:
> > > On 05/03/2025 17:56, Sabrina Dubroca wrote:
> > > > 2025-03-05, 14:14:36 +0100, Antonio Quartulli wrote:
> > > > > On 05/03/2025 12:20, Sabrina Dubroca wrote:
> > > > > > 2025-03-05, 00:19:32 +0100, Antonio Quartulli wrote:
> > > > > > > On 04/03/2025 19:37, Sabrina Dubroca wrote:
> > > > > > > > 2025-03-04, 01:33:48 +0100, Antonio Quartulli wrote:
> > > > > > > > > +void ovpn_peer_endpoints_update(struct ovpn_peer *peer, struct sk_buff *skb)
> > > > > > > > > +{
> > > > > > > > > +	struct hlist_nulls_head *nhead;
> > > > > > > > > +	struct sockaddr_storage ss;
> > > > > > > > > +	const u8 *local_ip = NULL;
> > > > > > > > > +	struct sockaddr_in6 *sa6;
> > > > > > > > > +	struct sockaddr_in *sa;
> > > > > > > > > +	struct ovpn_bind *bind;
> > > > > > > > > +	size_t salen = 0;
> > > > > > > > > +
> > > > > > > > > +	spin_lock_bh(&peer->lock);
> > > > > > > > > +	bind = rcu_dereference_protected(peer->bind,
> > > > > > > > > +					 lockdep_is_held(&peer->lock));
> > > > > > > > > +	if (unlikely(!bind))
> > > > > > > > > +		goto unlock;
> > > > > > > > > +
> > > > > > > > > +	switch (skb->protocol) {
> > > > > > > > > +	case htons(ETH_P_IP):
> > > > > > > > > +		/* float check */
> > > > > > > > > +		if (unlikely(!ovpn_bind_skb_src_match(bind, skb))) {
> > > > > > > > > +			if (bind->remote.in4.sin_family == AF_INET)
> > > > > > > > > +				local_ip = (u8 *)&bind->local;
> > > > > > > > 
> > > > > > > > If I'm reading this correctly, we always reuse the existing local
> > > > > > > > address when we have to re-create the bind, even if it doesn't match
> > > > > > > > the skb? The "local endpoint update" chunk below is doing that, but
> > > > > > > > only if we're keeping the same remote? It'll get updated the next time
> > > > > > > > we receive a packet and call ovpn_peer_endpoints_update.
> > > > > > > > 
> > > > > > > > That might irritate the RPF check on the other side, if we still use
> > > > > > > > our "old" source to talk to the new dest?
> > > > > > > > 
> > > > > > > > > +			sa = (struct sockaddr_in *)&ss;
> > > > > > > > > +			sa->sin_family = AF_INET;
> > > > > > > > > +			sa->sin_addr.s_addr = ip_hdr(skb)->saddr;
> > > > > > > > > +			sa->sin_port = udp_hdr(skb)->source;
> > > > > > > > > +			salen = sizeof(*sa);
> > > > > > > > > +			break;
> > > > > > > 
> > > > > > > I think the issue is simply this 'break' above - by removing it, everything
> > > > > > > should work as expected.
> > > > > > 
> > > > > > Only if the bind was of the correct family? Checking an IPv4 local
> > > > > > address (in the bind) against an IPv6 source address in the packet (or
> > > > > > the other way around) isn't going to work well.
> > > > > 
> > > > > Ah I understand what you mean.
> > > > > 
> > > > > The purpose of "local_ip" is to provide a working local endpoint to be used
> > > > > with the new remote address.
> > > > > However, if the float is switching family we can't re-use the same old local
> > > > > endpoint (hence the check).
> > > > > In this case we'll learn the "new" local address later.
> > > > > 
> > > > > Does it make sense?
> > > > 
> > > > Sure, but we could have learned it immediately from the packet we just
> > > > got, whether we're changing family or not. No need to wait for the
> > > > next RX packet to also learn the new local address.
> > > 
> > > Indeed.
> > > 
> > > > 
> > > > But if we now do a dst_cache_reset with the peer float,
> > > > ovpn_udp*_output will have to do a new route/local address lookup and
> > > > I guess that should clean up the local address stored in the bind, and
> > > > then update the dst_cache with the local address we just found.
> > > 
> > > Right and this may not truly be what we want.
> > > 
> > > If peer X is sending packets to our IP1, we should at least try to reply
> > > from the same address.
> > > 
> > > If we have two IPs, IP1 and IP2, and both can be used to reach peer X, we
> > > should always try to use the one where we received traffic from X in the
> > > first place.
> > 
> > I had a thought that it might not be our prefered address to talk to
> > X, but it would probably be, since we decided to use it (and thus X
> > used it as remote to talk to us).
> 
> I am not sure I follow this sentence: I think you are just confirming what I
> said above (please correct me if I am wrong)?

Yes. This may turn out to be incorrect (see the "some corner cases"
bit below), but let's wait for examples of that.

> > > OTOH hand it is also true that with floating detection on both sides, the
> > > situation will converge quickly, but there might be a reason why X chose IP1
> > > as destination, therefore we should do our best to respect that.
> > 
> > And I guess the primary reason for X to choose IP1 would be "we sent
> > packets to X from IP1".
> 
> Probably. It truly depends on who initiated the connection.
> 
> > 
> > > So, even in case of float, we should still store the local endpoint and
> > > attempt fetching a route that takes that into consideration.
> > > Which I think is what is happening (assuming we reset the dst_cache on
> > > float).
> > 
> > Not at the same time as float, unless ovpn_peer_endpoints_update sets
> > local_ip = ip_hdr(skb)->daddr unconditionally on float?
> > 
> > Otherwise the next route lookup in ovpn_udpX_output will pick whatever
> > source address it wants (which would likely match what's in the
> > received skb during float, so probably fine anyway).
> > 
> 
> But that's what the code just below in ovpn_peer_endpoints_update() does,
> no?

I think this is going a bit in circles :) And possibly completely
irrelevant.

The /* local endpoint update */ is (correctly) skipped in case of
float [because the family may not be right so we can't compare
addresses].

In case of float, I think setting local_ip to the packet's header
would do the right thing is all cases:
- if it matches what's in the old bind, great, we didn't change our
  local IP, just the remote and we could have kept local_ip = &bind->local
- if it doesn't match, we learn it (but that brings the question: why
  did the peer think our address was IP2 and we thought it was IP1? --
  which may be an irrelevant corner case)


> 
>  223                 /* local endpoint update */
>  224                 if (unlikely(bind->local.ipv4.s_addr !=
> ip_hdr(skb)->daddr)) {
> 
> ...
> 
>  229                         bind->local.ipv4.s_addr = ip_hdr(skb)->daddr;
> 
> 
> > > ovpn_udpX_output() will:
> > > * get no rt from the cache
> > > * possibly confirm that saddr is ok
> > > * fetch the new rt using the provided saddr and daddr
> > > * update the cache.
> > > 
> > > That makes sense to me.
> > > Would you agree?
> > 
> > With dst_cache reset on float, yes. As long as we have that, the main
> > behavior seems correct to me. (maybe some corner cases will not be
> > handled optimally, but that can be improved later - which is most
> > likely what I've been discussing in these emails :))
> 
> Yeah :)
> > 
> > [this could be a useful counter to add in the future: number of floats
> > and local address updates - so the user can check if that's increasing
> > "too often", which would indicate something weird is happening]
> 
> ACK, good idea!
> 
> Thanks!
> 
> Ok, I'll probably wait a little more and then prepare v22.

I won't be able to look at it until the week-end. But if you're ready,
you can go ahead and post it anyway.

-- 
Sabrina

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ