Message-ID: <Yv8lGtYIz4z043aI@unreal>
Date: Fri, 19 Aug 2022 08:52:26 +0300
From: Leon Romanovsky <leon@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Steffen Klassert <steffen.klassert@...unet.com>,
"David S . Miller" <davem@...emloft.net>,
Herbert Xu <herbert@...dor.apana.org.au>,
netdev@...r.kernel.org, Raed Salem <raeds@...dia.com>,
ipsec-devel <devel@...ux-ipsec.org>,
Jason Gunthorpe <jgg@...dia.com>
Subject: Re: [PATCH xfrm-next v2 0/6] Extend XFRM core to allow full offload configuration
On Thu, Aug 18, 2022 at 07:34:49PM -0700, Jakub Kicinski wrote:
> On Thu, 18 Aug 2022 08:24:13 +0300 Leon Romanovsky wrote:
> > On Wed, Aug 17, 2022 at 11:10:52AM -0700, Jakub Kicinski wrote:
> > > My point is on what you called "full offload" vs "crypto offload".
> > > The policy so far has always been that Linux networking stack should
> > > populate all the headers and instruct the device to do crypto, no
> > > header insertion. Obviously we do header insertion in switch/router
> > > offloads but that's different and stateless.
> > >
> > > I believe the reasoning was to provide as much flexibility and control
> > > to the software as possible while retaining most of the performance
> > > gains.
> >
> > I honestly don't know the reasoning, but "performance gains" are very
> > limited as long as IPsec stack involved with various policy/state
>
> Herm. So you didn't bother figuring out what the current problems are
> but unsurprisingly the solution is "buy our product and let us do it"?
Our hardware didn't support full offload back then and crypto mode
was the one that was supported in our mlx5 FPGA offering. There are
no "other" reasons from our side.
>
> > lookups. These lookups are expensive in terms of CPU and they can't
> > hold 400 Gb/s line rate.
> >
> > https://docs.nvidia.com/networking/display/connectx7en/Introduction#Introduction-ConnectX-7400GbEAdapterCards
> >
> > > You must provide a clear analysis (as in examination in data) and
> > > discussion (as in examination in writing) if you're intending to
> > > change the "let's keep packet formation in the SW" policy. What you
> > > got below is a good start but not sufficient.
> >
> > Can you please point me to an example of such analysis, so I will know
> > what is needed/expected?
>
> I can't, as I said twice now, we don't have any "full crypto" offloads
> AFAIK.
No, I'm asking for an example of "clear analysis (as in examination in
data)", as I don't understand this sentence. I'm not asking for "full
crypto" examples.
This "discussion (as in examination in writing)" part is clear.
>
> > > > IPsec full offload is actually improved version of IPsec crypto mode,
> > > > In full mode, HW is responsible to trim/add headers in addition to
> > > > decrypt/encrypt. In this mode, the packet arrives to the stack as already
> > > > decrypted and vice versa for TX (exits to HW as not-encrypted).
> > > >
> > > > My main motivation is to perform IPsec on RoCE traffic and in our
> > > > preliminary results, we are able to do IPsec full offload in line
> > > > rate. The same goes for ETH traffic.
> > >
> > > If the motivation is RoCE I personally see no reason to provide the
> > > configuration of this functionality via netdev interfaces, but I'll
> > > obviously leave the final decision to Steffen.
> >
> > This is not limited to RoCE, our customers use this offload for ethernet
> > traffic as well.
> >
> > RoCE is a good example of traffic that performs all headers magic in HW,
> > without SW involved.
> >
> > IPsec clearly belongs to netdev and we don't want to duplicate netdev
> > functionality in RDMA. Like I said above, this feature is needed for
> > regular ETH traffic as well.
> >
> > Right now, RoCE and iWARP devices are based on netdev and long-standing
> > agreement ( >20 years ????) that all netdev configurations are done
> > there they belong - in netdev.
>
> Let me be very clear - as far as I'm concerned no part of the RDMA
> stack belongs in netdev. What's there is there, but do not try to use
> that argument to justify more stuff.
>
> If someone from the community thinks that I should have interest in
> working on / helping proprietary protocol stacks please let me know,
> because right now I have none.
No one is asking you to work on proprietary protocols.
RoCE is an IBTA standard protocol and iWARP is an IETF one. Both are
fully documented and backed by multiple vendors (Intel, IBM, Mellanox,
Cavium ...).
There is also an interoperability lab, https://www.iol.unh.edu/, that
runs various tests, in addition to the testing done by distro
interoperability labs.
I invite you to take a look at Jason's presentation "Challenges of the
RDMA subsystem", which he gave 3 years ago, about RDMA and its
challenges with netdev.
https://lpc.events/event/4/contributions/364/
Thanks