netdev - Re: [PATCH net-next 0/2] TLS TX HW offload for Bond

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7056d516-20a5-113b-ca04-b32d326f7742@gmail.com>
Date:   Tue, 24 Nov 2020 17:08:42 +0200
From:   Tariq Toukan <ttoukan.linux@...il.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Jarod Wilson <jarod@...hat.com>, Tariq Toukan <tariqt@...dia.com>,
        "David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
        Saeed Mahameed <saeedm@...dia.com>,
        Moshe Shemesh <moshe@...dia.com>,
        Maor Gottlieb <maorg@...dia.com>
Subject: Re: [PATCH net-next 0/2] TLS TX HW offload for Bond



On 11/23/2020 8:20 PM, Jakub Kicinski wrote:
> On Sun, 22 Nov 2020 14:48:04 +0200 Tariq Toukan wrote:
>> On 11/19/2020 6:38 PM, Jakub Kicinski wrote:
>>> On Thu, 19 Nov 2020 17:59:38 +0200 Tariq Toukan wrote:
>>>> On 11/19/2020 2:02 AM, Jakub Kicinski wrote:
>>>>> On Sun, 15 Nov 2020 15:42:49 +0200 Tariq Toukan wrote:
>>>>>> This series opens TLS TX HW offload for bond interfaces.
>>>>>> This allows bond interfaces to benefit from capable slave devices.
>>>>>>
>>>>>> The first patch adds real_dev field in TLS context structure, and aligns
>>>>>> usages in TLS module and supporting drivers.
>>>>>> The second patch opens the offload for bond interfaces.
>>>>>>
>>>>>> For the configuration above, SW kTLS keeps picking the same slave
>>>>>> To keep simple track of the HW and SW TLS contexts, we bind each socket to
>>>>>> a specific slave for the socket's whole lifetime. This is logically valid
>>>>>> (and similar to the SW kTLS behavior) in the following bond configuration,
>>>>>> so we restrict the offload support to it:
>>>>>>
>>>>>> ((mode == balance-xor) or (mode == 802.3ad))
>>>>>> and xmit_hash_policy == layer3+4.
>>>>>
>>>>> This does not feel extremely clean, maybe you can convince me otherwise.
>>>>>
>>>>> Can we extend netdev_get_xmit_slave() and figure out the output dev
>>>>> (and if it's "stable") in a more generic way? And just feed that dev
>>>>> into TLS handling?
>>>>
>>>> I don't see we go through netdev_get_xmit_slave(), but through
>>>> .ndo_start_xmit (bond_start_xmit).
>>>
>>> I may be misunderstanding the purpose of netdev_get_xmit_slave(),
>>> please correct me if I'm wrong. AFAIU it's supposed to return a
>>> lower netdev that the skb should then be xmited on.
>>
>> That's true. It was recently added and used by the RDMA team. Not used
>> or integrated in the Eth networking stack.
>>
>>> So what I was thinking was either construct an skb or somehow reshuffle
>>> the netdev_get_xmit_slave() code to take a flow dissector output or
>>> ${insert other ideas}. Then add a helper in the core that would drill
>>> down from the socket netdev to the "egress" netdev. Have TLS call
>>> that helper, and talk to the "egress" netdev from the start, rather
>>> than the socket's netdev. Then loosen the checks on software devices.
>>
>> As I understand it, best if we can even generalize this to apply to all
>> kinds of traffic: bond driver won't do the xmit itself anymore, it just
>> picks an egress dev and returns it. The core infrastructure will call
>> the xmit function for the egress dev.
> 
> I think you went way further than I was intending :) I was only
> considering the control path. Leave the datapath unchanged.
> 
> AFAIK you're making 3 changes:
>   - forwarding tls ops
>   - pinning flows
>   - handling features
> 
> Pinning of the TLS device to a leg of the bond looks like ~15LoC.
> I think we can live with that.
> 

Good.
You mean the changes under __bond_start_xmit ?

> It's the 150 LoC of forwarding TLS ops and duplicating dev selection
> logic in bond_sk_hash_l34() that I'd rather avoid.
> 

I see.
But there are several issues with this:

1. The .ndo_get_xmit_slave acts on an SKB, not a socket. Hence, it 
doesn't fit for the stage of calling tls_dev_add, unless the ndo goes 
through some refactoring before the feature itself.

2. Existing hash logic acts on an SKB. We must have one that acts on a 
socket to be used inside get_slave(sk). Hence, I don't really see how 
the logic under bond_sk_hash_l34() are going to disappear, maybe just 
move around to a new place.


> Handling features is probably fine, too, I haven't thought about that
> much.
> 

Good.

>> I like the idea, it can generalize code structures for all kinds of
>> upper-devices and sockets, taking them into a common place in core,
>> which reduces code duplications.
>>
>> If we go only half the way, i.e. keep xmit logic in bond for
>> non-TLS-offloaded traffic, then we have to let TLS module (and others in
>> the future) act deferentially for different kinds of devs (upper/lower)
>> which IMHO reduces generality.
> 
> How so? I was expecting TLS to just do something like:
> 
> 	netdev = sk_get_xmit_dev_lowest(sk);
> 
> which would recursively call get_xmit_slave(CONST) until it reaches
> a device which doesn't resolve further.
> 
> BTW is the flow pinning to bond legs actually a must-do? I don't know
> much about bonding but wouldn't that mean that if the selected leg goes
> down we'd lose connectivity, rather than falling back to SW crypto?
> 

Right. As long as we don't have logic for un-offloading.
Currently in TLS, the device-offloaded connections has some 
"independence" once they are created, it's hard to modify them and apply 
configuration modifications to them (example: interaction with tx csum 
offload).
So I think there is a missing un-offloading mechanism in TLS that should 
address all of these together.

This fits with your comments below.

>> I'm in favor of the deeper change. It will be on a larger scale, and
>> totally orthogonal to the current TLS offload support in bond.
>>
>> If we decide to apply the idea only to TLS sockets (or any subset of
>> sockets) we're actually taking a generic one-flow (the xmit patch of a
>> bond dev) and turning it into two (or potentially more) flows, depending
>> on the socket type. This also reduces generality.
> 
> I don't follow this part.
> 
>>> I'm probably missing the problem you're trying to explain to me :S
>>
>> I kept the patch minimal, and kept the TLS offload logic internal to the
>> bond driver, just like it is internal to the device drivers (mlx5e, and
>> others), with no core infrastructure modification.
>>
>>> Side note - Jarod, I'd be happy to take a patch renaming
>>> netdev_get_xmit_slave() and the ndo, if you have the cycles to send
>>> a patch. It's a recent addition, and in the core we should make more
>>> of an effort to avoid sensitive terms.
>>>    
>>>> Currently I have my check there to
>>>> catch all skbs belonging to offloaded TLS sockets.
>>>>
>>>> The TLS offload get_slave() logic decision is per socket, so the result
>>>> cannot be saved in the bond memory. Currently I save the real_dev field
>>>> in the TLS context structure.
>>>
>>> Right, but we could just have ctx->netdev be the "egress" netdev
>>> always, right? Do we expect somewhere that it's going to be matching
>>> the socket's dst?
>>
>> So once the offload context is established we totally bypass the bond
>> dev? and lose interaction or reference to it?
> 
> Yup, I don't think we need it.
> 
>> What if the egress dev is detached form the bond? We must then be
>> notified somehow.
> 
> Do we notify TLS when routing changes? I think it's a separate topic.
> 
> If we have the code to "un-offload" a flow we could handle clearing
> features better and notify from sk_validate_xmit_skb that the flow
> started hitting unexpected dev, hence it should be re-offloaded.
> 
> I don't think we need an explicit invalidation from the particular
> drivers here.
> 

Agree.