[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201123102040.338f32c7@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Mon, 23 Nov 2020 10:20:40 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Tariq Toukan <ttoukan.linux@...il.com>
Cc: Jarod Wilson <jarod@...hat.com>, Tariq Toukan <tariqt@...dia.com>,
"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
Saeed Mahameed <saeedm@...dia.com>,
Moshe Shemesh <moshe@...dia.com>,
Maor Gottlieb <maorg@...dia.com>
Subject: Re: [PATCH net-next 0/2] TLS TX HW offload for Bond
On Sun, 22 Nov 2020 14:48:04 +0200 Tariq Toukan wrote:
> On 11/19/2020 6:38 PM, Jakub Kicinski wrote:
> > On Thu, 19 Nov 2020 17:59:38 +0200 Tariq Toukan wrote:
> >> On 11/19/2020 2:02 AM, Jakub Kicinski wrote:
> >>> On Sun, 15 Nov 2020 15:42:49 +0200 Tariq Toukan wrote:
> >>>> This series opens TLS TX HW offload for bond interfaces.
> >>>> This allows bond interfaces to benefit from capable slave devices.
> >>>>
> >>>> The first patch adds real_dev field in TLS context structure, and aligns
> >>>> usages in TLS module and supporting drivers.
> >>>> The second patch opens the offload for bond interfaces.
> >>>>
> >>>> For the configuration above, SW kTLS keeps picking the same slave
> >>>> To keep simple track of the HW and SW TLS contexts, we bind each socket to
> >>>> a specific slave for the socket's whole lifetime. This is logically valid
> >>>> (and similar to the SW kTLS behavior) in the following bond configuration,
> >>>> so we restrict the offload support to it:
> >>>>
> >>>> ((mode == balance-xor) or (mode == 802.3ad))
> >>>> and xmit_hash_policy == layer3+4.
> >>>
> >>> This does not feel extremely clean, maybe you can convince me otherwise.
> >>>
> >>> Can we extend netdev_get_xmit_slave() and figure out the output dev
> >>> (and if it's "stable") in a more generic way? And just feed that dev
> >>> into TLS handling?
> >>
> >> I don't see we go through netdev_get_xmit_slave(), but through
> >> .ndo_start_xmit (bond_start_xmit).
> >
> > I may be misunderstanding the purpose of netdev_get_xmit_slave(),
> > please correct me if I'm wrong. AFAIU it's supposed to return a
> > lower netdev that the skb should then be xmited on.
>
> That's true. It was recently added and used by the RDMA team. Not used
> or integrated in the Eth networking stack.
>
> > So what I was thinking was either construct an skb or somehow reshuffle
> > the netdev_get_xmit_slave() code to take a flow dissector output or
> > ${insert other ideas}. Then add a helper in the core that would drill
> > down from the socket netdev to the "egress" netdev. Have TLS call
> > that helper, and talk to the "egress" netdev from the start, rather
> > than the socket's netdev. Then loosen the checks on software devices.
>
> As I understand it, best if we can even generalize this to apply to all
> kinds of traffic: bond driver won't do the xmit itself anymore, it just
> picks an egress dev and returns it. The core infrastructure will call
> the xmit function for the egress dev.
I think you went way further than I was intending :) I was only
considering the control path. Leave the datapath unchanged.
AFAIK you're making 3 changes:
- forwarding tls ops
- pinning flows
- handling features
Pinning of the TLS device to a leg of the bond looks like ~15LoC.
I think we can live with that.
It's the 150 LoC of forwarding TLS ops and duplicating dev selection
logic in bond_sk_hash_l34() that I'd rather avoid.
Handling features is probably fine, too, I haven't thought about that
much.
> I like the idea, it can generalize code structures for all kinds of
> upper-devices and sockets, taking them into a common place in core,
> which reduces code duplications.
>
> If we go only half the way, i.e. keep xmit logic in bond for
> non-TLS-offloaded traffic, then we have to let TLS module (and others in
> the future) act deferentially for different kinds of devs (upper/lower)
> which IMHO reduces generality.
How so? I was expecting TLS to just do something like:
netdev = sk_get_xmit_dev_lowest(sk);
which would recursively call get_xmit_slave(CONST) until it reaches
a device which doesn't resolve further.
BTW is the flow pinning to bond legs actually a must-do? I don't know
much about bonding but wouldn't that mean that if the selected leg goes
down we'd lose connectivity, rather than falling back to SW crypto?
> I'm in favor of the deeper change. It will be on a larger scale, and
> totally orthogonal to the current TLS offload support in bond.
>
> If we decide to apply the idea only to TLS sockets (or any subset of
> sockets) we're actually taking a generic one-flow (the xmit patch of a
> bond dev) and turning it into two (or potentially more) flows, depending
> on the socket type. This also reduces generality.
I don't follow this part.
> > I'm probably missing the problem you're trying to explain to me :S
>
> I kept the patch minimal, and kept the TLS offload logic internal to the
> bond driver, just like it is internal to the device drivers (mlx5e, and
> others), with no core infrastructure modification.
>
> > Side note - Jarod, I'd be happy to take a patch renaming
> > netdev_get_xmit_slave() and the ndo, if you have the cycles to send
> > a patch. It's a recent addition, and in the core we should make more
> > of an effort to avoid sensitive terms.
> >
> >> Currently I have my check there to
> >> catch all skbs belonging to offloaded TLS sockets.
> >>
> >> The TLS offload get_slave() logic decision is per socket, so the result
> >> cannot be saved in the bond memory. Currently I save the real_dev field
> >> in the TLS context structure.
> >
> > Right, but we could just have ctx->netdev be the "egress" netdev
> > always, right? Do we expect somewhere that it's going to be matching
> > the socket's dst?
>
> So once the offload context is established we totally bypass the bond
> dev? and lose interaction or reference to it?
Yup, I don't think we need it.
> What if the egress dev is detached form the bond? We must then be
> notified somehow.
Do we notify TLS when routing changes? I think it's a separate topic.
If we have the code to "un-offload" a flow we could handle clearing
features better and notify from sk_validate_xmit_skb that the flow
started hitting unexpected dev, hence it should be re-offloaded.
I don't think we need an explicit invalidation from the particular
drivers here.
Powered by blists - more mailing lists