netdev - Re: [PATCH xfrm-next v2 0/6] Extend XFRM core to allow full offload configuration

Message-ID: <20220818101031.GC566407@gauss3.secunet.de>
Date:   Thu, 18 Aug 2022 12:10:31 +0200
From:   Steffen Klassert <steffen.klassert@...unet.com>
To:     Leon Romanovsky <leon@...nel.org>
CC:     Jakub Kicinski <kuba@...nel.org>,
        "David S . Miller" <davem@...emloft.net>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        <netdev@...r.kernel.org>, Raed Salem <raeds@...dia.com>,
        ipsec-devel <devel@...ux-ipsec.org>,
        Jason Gunthorpe <jgg@...dia.com>
Subject: Re: [PATCH xfrm-next v2 0/6] Extend XFRM core to allow full offload
 configuration

On Thu, Aug 18, 2022 at 08:24:13AM +0300, Leon Romanovsky wrote:
> On Wed, Aug 17, 2022 at 11:10:52AM -0700, Jakub Kicinski wrote:
> > On Wed, 17 Aug 2022 08:22:02 +0300 Leon Romanovsky wrote:
> > > On Tue, Aug 16, 2022 at 07:54:08PM -0700, Jakub Kicinski wrote:
> > > > This is making a precedent for full tunnel offload in netdev, right?  
> > > 
> > > Not really. SW IPsec supports two modes: tunnel and transport.
> > > 
> > > However, the HW and SW stack support offload of transport mode only.
> > > This is the case for the already merged IPsec crypto offload mode and
> > > the case for this full offload.
> > 
> > My point is on what you called "full offload" vs "crypto offload".
> > The policy so far has always been that Linux networking stack should
> > populate all the headers and instruct the device to do crypto, no
> > header insertion. Obviously we do header insertion in switch/router
> > offloads but that's different and stateless.
> > 
> > I believe the reasoning was to provide as much flexibility and control
> > to the software as possible while retaining most of the performance
> > gains.
> 
> I honestly don't know the reasoning, but "performance gains" are very
> limited as long as the IPsec stack is involved in various policy/state
> lookups. These lookups are expensive in terms of CPU and they can't
> hold 400 Gb/s line rate.

Can you provide some performance results that show the difference
between crypto and full offload? In particular because on the TX
path, the full policy and state offload is done twice (in software
to find the offloading device and then in hardware to match policy
to state).

> 
> https://docs.nvidia.com/networking/display/connectx7en/Introduction#Introduction-ConnectX-7400GbEAdapterCards
> 
> > 
> > You must provide a clear analysis (as in examination in data) and
> > discussion (as in examination in writing) if you're intending to 
> > change the "let's keep packet formation in the SW" policy. What you 
> > got below is a good start but not sufficient.

I'm still a bit uneasy about this approach. I fear that doing parts
of stateful IPsec processing in software and parts in hardware will
lead to all sorts of problems. E.g. with this implementation
the software has no stats, lifetime, lifebyte and packet count
information but is responsible for the IKE communication.

We might be able to sort out all problems during the upstreaming
process, but I still have no clear picture of how this should work
in the end with all the corner cases this creates.

Also the name full offload is a bit misleading, because the
software still has to hold all offloaded states and policies.
In a full offload, the stack would IMO just act as a stub
layer between IKE and hardware.

> > > Some of them:
> > > 1. Request to have reqid for policy and state. I use reqid for HW
> > > matching between policy and state.
> > 
> > reqid?
> 
> Policy and state are matched based on their selectors (src/dest IP, direction ...),
> but they are independent. The reqid is the XFRM identification that this specific policy
> is connected to this specific state.
> https://www.man7.org/linux/man-pages/man8/ip-xfrm.8.html
> https://docs.kernel.org/networking/xfrm_device.html
> ip x s add ....
>    reqid 0x07 ...
>    offload dev eth4 dir in

Can you elaborate on this a bit more? Does that matching happen in
hardware? The reqid is not a unique identifier to match between
policy and state. You MUST match the selectors as defined in
https://www.rfc-editor.org/rfc/rfc4301
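
To illustrate the point about reqid, here is a toy model in Python. It is
only a sketch under stated assumptions: the Selector and State classes, the
helper names, the addresses and the SA names are all hypothetical and do not
correspond to kernel data structures or to any driver API. It just shows that
two states can carry the same reqid, so a lookup by reqid alone can be
ambiguous, while a lookup that also checks the selectors is not.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Selector:
    """Hypothetical, simplified selector: source and destination prefixes only."""
    src: str
    dst: str

    def matches(self, src_ip: str, dst_ip: str) -> bool:
        # A packet matches when both addresses fall inside the prefixes.
        return (ip_address(src_ip) in ip_network(self.src)
                and ip_address(dst_ip) in ip_network(self.dst))


@dataclass(frozen=True)
class State:
    """Hypothetical stand-in for an SA; not the kernel's xfrm_state."""
    name: str
    reqid: int
    selector: Selector


def lookup_by_reqid(states, reqid):
    # Matching on reqid alone: may return more than one state.
    return [s for s in states if s.reqid == reqid]


def lookup_by_reqid_and_selector(states, reqid, src_ip, dst_ip):
    # Matching on reqid AND selector: narrows the result to the states
    # whose selector actually covers the packet.
    return [s for s in states
            if s.reqid == reqid and s.selector.matches(src_ip, dst_ip)]


# Two made-up states that share reqid 0x07 but cover different traffic.
STATES = [
    State("sa-a", 0x07, Selector("10.0.1.0/24", "10.0.2.0/24")),
    State("sa-b", 0x07, Selector("10.0.3.0/24", "10.0.4.0/24")),
]

if __name__ == "__main__":
    print([s.name for s in lookup_by_reqid(STATES, 0x07)])
    print([s.name for s in lookup_by_reqid_and_selector(
        STATES, 0x07, "10.0.1.5", "10.0.2.9")])
```

In this toy setup the reqid-only lookup returns both states, while the
lookup that also checks the selector returns only the one covering the
packet.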