netdev - Re: [PATCH net] l2tp: Allow duplicate session creation with UDP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0d7f9d7e-e13b-8254-6a90-fc08bade3e16@katalix.com>
Date:   Thu, 30 Jan 2020 10:28:23 +0000
From:   James Chapman <jchapman@...alix.com>
To:     Guillaume Nault <gnault@...hat.com>
Cc:     Tom Parkin <tparkin@...alix.com>,
        Ridge Kennedy <ridgek@...iedtelesis.co.nz>,
        netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH net] l2tp: Allow duplicate session creation with UDP

On 29/01/2020 11:44, Guillaume Nault wrote:
> On Mon, Jan 27, 2020 at 09:25:30AM +0000, James Chapman wrote:
>> On 25/01/2020 11:57, Guillaume Nault wrote:
>>> On Wed, Jan 22, 2020 at 11:55:35AM +0000, James Chapman wrote:
>>>> In my experience, poor L2TP data performance is usually the result of
>>>> MTU config issues causing IP fragmentation in the tunnel. L2TPv3
>>>> ethernet throughput is similar to ethernet bridge throughput in the
>>>> setups that I know of.
>>>>
>>> I said that because I remember I had tested L2TPv3 and VXLAN a few
>>> years ago and I was surprised by the performance gap. I certainly can't
>>> remember the details of the setup, but I'd be very surprised if I had
>>> misconfigured the MTU.
>> Fair enough. I'd be interested in your observations and ideas regarding
>> improving performance at some point. But I suggest keep this thread
>> focused on the session ID scope issue.
>>
> I had started working on the data path more than a year ago, but never
> got far enough to submit anything. I might revive this work if I find
> enough time. But yes, sure, let's focus on the sessions IDs for now.
>
>>>> I can't see how replacing a lookup using a 32-bit hash key with one
>>>> using a 260-bit or more key (128+128+4 for two IPv[46] addresses and
>>>> the session ID) isn't going to hurt performance, let alone the
>>>> per-session memory footprint. In addition, it is changing the scope of
>>>> the session ID away from what is defined in the RFC.
>>>>
>>> I don't see why we'd need to increase the l2tp_session's structure size.
>>> We can already get the 3/5-tuple from the parent's tunnel socket. And
>>> there are some low hanging fruits to pick if one wants to reduce L2TP's
>>> memory footprint.
>>>
>>> From a performance point of view, 3/5-tuple matches are quite common
>>> operations in the networking stack. I don't expect that to be costly
>>> compared to the rest of the L2TP Rx operations. And we certainly have
>>> room to streamline the datapath if necessary.
>> I was assuming the key used for the session ID lookup would be stored
>> with the session so that we wouldn't have to prepare it for each and
>> every packet receive.
>>
> I don't think that we could store the hash in the session structure.
> The tunnel socket could be rebound or reconnected, thus changing the
> 5/3-tuple from under us.
>
> My idea was to lookup the hash bucket using only the session ID, then
> select the session from this bucket by checking both the session ID and
> the 5/3-tuple.

Ah I see. I'm more comfortable with this now.


>>>> I think Linux shouldn't diverge from the spirit of the L2TPv3 RFC
>>>> since the RFC is what implementors code against. Ridge's application
>>>> relies on duplicated L2TPv3 session IDs which are scoped by the UDP
>>>> 5-tuple address. But equally, there may be existing applications out
>>>> there which rely on Linux performing L2TPv3 session lookup by session
>>>> ID alone, as per the RFC. For IP-encap, Linux already does this, but
>>>> not for UDP. What if we get a request to do so for UDP, for
>>>> RFC-compliance? It would be straightforward to do as long as the
>>>> session ID scope isn't restricted by the proposed patch.
>>>>
>>> As long as the external behavior conforms to the RFC, I don't see any
>>> problem. Local applications are still responsible for selecting
>>> session-IDs. I don't see how they could be confused if the kernel
>>> accepted more IDs, especially since that was the original behaviour.
>> But it wouldn't conform with the RFC.
>>
>> RFC3931 says:
>>
>>  The Session ID alone provides the necessary context for all further
>>  packet processing, including the presence, size, and value of the
>>  Cookie, the type of L2-Specific Sublayer, and the type of payload
>>  being tunneled.
>>
>> and also
>>
>>  The data message format for tunneling data packets may be utilized
>>  with or without the L2TP control channel, either via manual
>>  configuration or via other signaling methods to pre-configure or
>>  distribute L2TP session information.
>>
> Since userspace is in charge of selecting the session ID, I still can't
> see how having the kernel accept duplicate IDs goes against the RFC.
> The kernel doesn't assign duplicate IDs on its own. Userspace has full
> control on the IDs and can implement whatever constraint when assigning
> session IDs (even the DOCSIS DEPI way of partioning the session ID
> space).
Perhaps another example might help.

Suppose there's an L2TPv3 app out there today that creates two tunnels
to a peer, one of which is used as a hot-standby backup in case the main
tunnel fails. This system uses separate network interfaces for the
tunnels, e.g. a router using a mobile network as a backup. If the main
tunnel fails, it switches traffic of sessions immediately into the
second tunnel. Userspace is deliberately using the same session IDs in
both tunnels in this case. This would work today for IP-encap, but not
for UDP. However, if the kernel treats session IDs as scoped by 3-tuple,
the application would break. The app would need to be modified to add
each session ID into both tunnels to work again.


>>> I would have to read the RFC with scoped session IDs in mind, but, as
>>> far as I can see, the only things that global session IDs allow which
>>> can't be done with scoped session IDs are:
>>>   * Accepting L2TPoIP sessions to receive L2TPoUDP packets and
>>>     vice-versa.
>>>   * Accepting L2TPv3 packets from peers we're not connected to.
>>>
>>> I don't find any of these to be desirable. Although Tom convinced me
>>> that global session IDs are in the spirit of the RFC, I still don't
>>> think that restricting their scope goes against it in any practical
>>> way. The L2TPv3 control plane requires a two way communication, which
>>> means that the session is bound to a given 3/5-tuple for control
>>> messages. Why would the data plane behave differently?
>> The Cable Labs / DOCSIS DEPI protocol is a good example. It is based on
>> L2TPv3 and uses the L2TPv3 data plane. It treats the session ID as
>> unscoped and not associated with a given tunnel.
>>
> Fair enough. Then we could add a L2TP_ATTR_SCOPE netlink attribute to
> sessions. A global scope would reject the session ID if another session
> already exists with this ID in the same network namespace. Sessions with
> global scope would be looked up solely based on their ID. A non-global
> scope would allow a session ID to be duplicated as long as the 3/5-tuple
> is different and no session uses this ID with global scope.
>
>>> I agree that it looks saner (and simpler) for a control plane to never
>>> assign the same session ID to sessions running over different tunnels,
>>> even if they have different 3/5-tuples. But that's the user space
>>> control plane implementation's responsability to select unique session
>>> IDs in this case. The fact that the kernel uses scoped or global IDs is
>>> irrelevant. For unmanaged tunnels, the administrator has complete
>>> control over the local and remote session IDs and is free to assign
>>> them globally if it wants to, even if the kernel would have accepted
>>> reusing session IDs.
>> I disagree. Using scoped session IDs may break applications that assume
>> RFC behaviour. I mentioned one example where session IDs are used
>> unscoped above.
>>
> I'm sorry, but I still don't understand how could that break any
> existing application.

Does my example of the hot-standby backup tunnel help?


> For L2TPoUDP, session IDs are always looked up in the context of the
> UDP socket. So even though the kernel has stopped accepting duplicate
> IDs, the session IDs remain scoped in practice. And with the
> application being responsible for assigning IDs, I don't see how making
> the kernel less restrictive could break any existing implementation.
> Again, userspace remains in full control for session ID assignment
> policy.
>
> Then we have L2TPoIP, which does the opposite, always looks up sessions
> globally and depends on session IDs being unique in the network
> namespace. But Ridge's patch does not change that. Also, by adding the
> L2TP_ATTR_SCOPE attribute (as defined above), we could keep this
> behaviour (L2TPoIP session could have global scope by default).

I'm looking at this with an end goal of having the UDP rx path later
modified to work the same way as IP-encap currently does. I know Linux
has never worked this way in the L2TPv3 UDP path and no-one has
requested that it does yet, but I think it would improve the
implementation if UDP and IP encap behaved similarly.

L2TP_ATTR_SCOPE would be a good way for the app to select which
behaviour it prefers.


>>>> Could we ask Ridge to submit a new version of his patch which includes
>>>> a knob to enable it?
>>>>
>>> But what would be the default behaviour? If it's "use global IDs", then
>>> we'll keep breaking applications that used to work with older kernels.
>>> Ridge would know how to revert to the ancient behaviour, but other
>>> users would probably never know about the knob. And if we set the
>>> default behaviour to "allow duplicate IDs for L2TPv3oUDP", then the
>>> knob doesn't need to be implemented as part of Ridge's fix. It can
>>> always be added later, if we ever decide to unify session lookups
>>> accross L2TPoUDP and L2TPoIP and that extending the session hash key
>>> proves not to be a practical solution.
>>
>> The default would be the current behaviour: "global IDs". We'll be
>> breaking applications that assume scoped session IDs, yes. But I think
>> the number of these applications will be minimal given the RFC is clear
>> that session IDs are unscoped and the kernel has worked this way for
>> almost 3 years.
>>
>> I think it's important that the kernel continues to treat the L2TPv3
>> session ID as "global".
>>
> I'm uncomfortable with this. 3 years is not that long, it's the typical
> long term support time for community kernels (not even mentioning
> "enterprise" distributions). Also, we have a report showing that the
> current behaviour broke some use cases, while we never had any problem
> reported for the ancient behaviour (which had been in place for 7
> years). 
This is the policy decision. I see pros and cons both ways. But perhaps
it's ok as long as the session ID can be treated as unscoped as a config
option. See later.

> And finally, rejecting duplicate IDs, won't make the session ID
> space global. As I pointed out earlier, L2TPoUDP sessions are still
> going to be scoped in practice, because that's how lookup is done
> currently. So I don't see what would be the benefit of artificially
> limitting the sessions IDs accepted by the kernel (but I agree that
> L2TPoIP session IDs have to remain unique in the network namespace).

I'd like UDP and IP to eventually work the same way.


>> However, there might be an alternative solution to fix this for Ridge's
>> use case that doesn't involve adding 3/5-tuple session ID lookups in the
>> receive path or adding a control knob...
>>
>> My understanding is that Ridge's application uses unmanaged tunnels
>> (like "ip l2tp" does). These use kernel sockets. The netlink tunnel
>> create request does not indicate a valid tunnel socket fd. So we could
>> use scoped session IDs for unmanaged UDP tunnels only. If Ridge's patch
>> were tweaked to allow scoped IDs only for UDP unmanaged tunnels (adding
>> a test for tunnel->fd < 0), managed tunnels would continue to work as
>> they do now and any application that uses unmanaged tunnels would get
>> scoped session IDs. No control knob or 3/5-tuple session ID lookups
>> required.
>>
> Well, I'd prefer to not introduce another subtle behaviour change. What
> does rejecting duplicate IDs bring us if the lookup is still done in
> the context of the socket? If the point is to have RFC compliance, then
> we'd also need to modify the lookup functions.
I agree, it's not ideal. Rejecting duplicate IDs for UDP will allow the
UDP rx path to be modified later to work the same way as IP. So my idea
was to allow for that change to be made later but only for managed
tunnels (sockets created by userspace). My worry with the original patch
is that it suggests that session IDs for UDP are always scoped by the
tunnel so tweaking it to apply only for unmanaged tunnels was a way of
showing this.

However, you've convinced me now that scoping the session ID by
3/5-tuple could work. As long as there's a mechanism that lets
applications choose whether the 3/5-tuple is ignored in the rx path, I'm
ok with it.