netdev - Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZA9EMJgoNsxfOhwV@hog>
Date:   Mon, 13 Mar 2023 16:41:36 +0100
From:   Sabrina Dubroca <sd@...asysnail.net>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     netdev@...r.kernel.org, Vadim Fedorenko <vfedorenko@...ek.ru>,
        Frantisek Krenzelok <fkrenzel@...hat.com>,
        Kuniyuki Iwashima <kuniyu@...zon.com>,
        Apoorv Kothari <apoorvko@...zon.com>,
        Boris Pismenny <borisp@...dia.com>,
        John Fastabend <john.fastabend@...il.com>,
        Shuah Khan <shuah@...nel.org>, linux-kselftest@...r.kernel.org,
        Gal Pressman <gal@...dia.com>,
        Marcel Holtmann <marcel@...tmann.org>
Subject: Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3

2023-02-23, 09:29:45 -0800, Jakub Kicinski wrote:
> On Thu, 23 Feb 2023 17:27:40 +0100 Sabrina Dubroca wrote:
> > Installing the key in HW and re-enabling the offload will need to
> > happen via the icsk_clean_acked callback. We'll need a workqueue so
> > that we don't actually talk to the driver from softirq.
> 
> Installing from icsk_clean_acked won't win us anything, right?
> We'll only need the key once the next sendmsg() comes, what's
> pushed to TCP with swenc is already out of our hands.

Avoiding an unpredictable slowdown on the sendmsg() call? We can deal
with that later if it turns out to be an issue. I simply didn't think
of deferring to the next sendmsg().

> > Then, we have to handle a failure to install the key. Since we're not
> > installing it in HW immediately during setsockopt, notifying userspace
> > of a rekey failure is more complicated. Maybe we can do a
> > rekey_prepare during the setsocktopt, and then the actual rekey is an
> > operation that cannot fail?
> 
> TLS offload silently falls back to SW on any errors. So that's fine.
> Just bump a counter. User/infra must be tracking error counters in 
> our current design already.

True. User might be a bit surprised by "well it was offloaded and now
it's not", but ok.

> > > Important consideration is making the non-rekey path as fast as
> > > possible (given rekeying is extremely rare). Looking at skb->decrypted
> > > should be very fast but we can possibly fit some other indication of
> > > "are we rekeying" into another already referenced cache line.
> > > We definitely don't want to have to look up the record to know what
> > > state we're in.
> > > 
> > > The fallback can't use AES-NI (it's in sirq context) so it's slower 
> > > than SW encrypt before queuing to TCP. Hence my first thought is using
> > > SW crypto for new key and let the traffic we already queued with old
> > > key drain leveraging HW crypto. But as I said the impact on performance
> > > when not rekeying is more important, and so is driver simplicity.  
> > 
> > Right, sorry, full tls_sw path and not the existing fallback.
> > 
> > Changing the socket ops back and forth between the HW and SW variants
> > worries me, because we only lock the socket once we have entered
> > tls_{device,sw}_sendmsg. So I think we have to stay on the _device ops
> > even during the SW crypto phase of the rekey, and let that call into
> > the SW variant after locking the socket and making sure we're in a
> > rekey.
> 
> Fair point :S
> 
> > > > Don't we have that already? If there's a retransmit while we're
> > > > setting the TX key in HW, data that was queued on the socket before
> > > > (and shouldn't be encrypted at all) would also be encrypted
> > > > otherwise. Or is it different with rekey?  
> > > 
> > > We have a "start marker" record which is supposed to indicate that
> > > anything before it has already been encrypted. The driver is programmed
> > > with the start seq no, when it sees a packet from before this seq no
> > > it checks if a record exists, finds its before the start marker and
> > > sends the data as is.  
> > 
> > Yes, I was looking into that earlier this week. I think we could reuse
> > a similar mechanism for rekeying. tls_dev_add takes tcp_sk->write_seq,
> > we could have a tls_dev_rekey op passing the new key and new write_seq
> > to the driver. I think we can also reuse the ->eor trick from
> > tls_set_device_offload, and we wouldn't have to look at
> > skb->decrypted. Close and push the current SW record, mark ->eor, pass
> > write_seq to the driver along with the key. Also pretty close to what
> > tls_device_resync_tx does.
> 
> That sounds like you'd expose the rekeying logic to the drivers?
> New op, having to track seq#...

Well, we have to call into the drivers to install the key, whether
that's a new rekey op, or adding an update argument to ->tls_dev_add,
or letting the driver guess that it's a rekey (or ignore that and just
install the key if rekey vs initial key isn't a meaningful
distinction).

We already feed drivers the seq# with ->tls_dev_add, so passing it for
rekeys as well is not a big change.

Does that seem problematic? Adding a rekey op seemed more natural to
me than simply using the existing _del + _add ops, but maybe we can
get away with just using those two ops.

-- 
Sabrina