[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <54cdcc22-821a-b0fc-2eb6-b88c47049188@gmail.com>
Date: Wed, 15 Mar 2023 22:26:55 +0200
From: Tariq Toukan <ttoukan.linux@...il.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com,
pabeni@...hat.com, borisp@...dia.com, john.fastabend@...il.com,
maximmi@...dia.com, tariqt@...dia.com, vfedorenko@...ek.ru,
Gal Pressman <gal@...dia.com>,
Saeed Mahameed <saeedm@...dia.com>
Subject: Re: [PATCH net-next v3 7/7] tls: rx: do not use the standard
strparser
On 13/03/2023 20:22, Jakub Kicinski wrote:
> On Sun, 12 Mar 2023 19:59:57 +0200 Tariq Toukan wrote:
>> On 09/03/2023 19:54, Jakub Kicinski wrote:
>>> On Thu, 9 Mar 2023 17:15:26 +0200 Tariq Toukan wrote:
>>>> A few fixes were introduced for this patch, but it seems to still cause
>>>> issues.
>>>>
>>>> I'm running simple client/server test with wrk and nginx and TLS RX
>>>> device offload on.
>>>> It fails with TlsDecryptError on the client side for the large file
>>>> (256000b), while succeeding for the small one (10000b). See repro
>>>> details below.
>>>>
>>>> I narrowed the issue down to this offending patch, by applying a few
>>>> reverts (had to solve trivial conflicts):
>>>
>>> What's the sequence of records in terms of being offloaded vs fall back?
>>> Could you whip up a simple ring buffer to see if previous records were
>>> offloaded and what the skb geometries where?
>>
>> Interesting. All records go through the sw fallback.
>>
>> Command:
>> $ wrk_openssl_3_0_0 -b2.2.2.2 -t1 -c1 -d2 --timeout 5s
>> https://2.2.2.3:20443/256000b.img
>
> Is wrk_openssl_3_0_0 a nginx command? Any CX6 DX card can do this?
>
It's the standard wrk client, nothing special here.
We simply rename it to indicate that it was compiled against a specific
openssl version.
>> Debug code:
>> @@ -1712,8 +1723,13 @@ static int tls_rx_one_record(struct sock *sk,
>> struct msghdr *msg,
>> int err;
>>
>> err = tls_decrypt_device(sk, msg, tls_ctx, darg);
>> - if (!err)
>> + if (!err) {
>> err = tls_decrypt_sw(sk, tls_ctx, msg, darg);
>> + printk("sk: %p, tls_decrypt_sw, err = %d\n", sk, err);
>> + } else {
>> + printk("sk: %p, tls_decrypt_device, err = %d\n", sk, err);
>> + }
>> + skb_dump(KERN_ERR, darg->skb, false);
>> if (err < 0)
>> return err;
>>
>> dmesg output including skb geometries is attached.
>
> Hm, could you add to the debug the addresses of the fragments
> (and decrypted status) of the data queued to TCP by the driver?
> And then the frag addresses in skb_dump() ?
>
I used the debug code below. Find dmesg output attached.
I disabled GRO to reduce complexity.
# cat /proc/net/tls_stat
TlsCurrTxSw 0
TlsCurrRxSw 0
TlsCurrTxDevice 1
TlsCurrRxDevice 1
TlsTxSw 0
TlsRxSw 0
TlsTxDevice 2
TlsRxDevice 2
TlsDecryptError 2
TlsRxDeviceResync 2
TlsDecryptRetry 0
TlsRxNoPadViolation 0
Insights:
In this repro, we always go through tls_decrypt_sw, never to
tls_decrypt_device.
The corrupted SKB is the one with mixed offloaded and non-offloaded
decryption. You can clearly see this by its len (2330624 bytes!),
followed by a kernel panic.
Calling skb_dump on the corrupted skb also hits the kernel panic.
Code:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 3f7b63d6616b..864288b5d412 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -2262,6 +2262,7 @@ static void mlx5e_handle_rx_cqe_mpwrq(struct
mlx5e_rq *rq, struct mlx5_cqe64 *cq
goto mpwrq_cqe_out;
}
+ printk("driver calling napi_gro_receive for skb %p, is decrypt =
%d\n", skb, skb->decrypted);
napi_gro_receive(rq->cq.napi, skb);
mpwrq_cqe_out:
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 782d3701b86f..3858299196d3 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1710,10 +1710,19 @@ static int tls_rx_one_record(struct sock *sk,
struct msghdr *msg,
struct tls_prot_info *prot = &tls_ctx->prot_info;
struct strp_msg *rxm;
int err;
+ int i;
err = tls_decrypt_device(sk, msg, tls_ctx, darg);
- if (!err)
+ if (!err) {
err = tls_decrypt_sw(sk, tls_ctx, msg, darg);
+ printk("sk: %p, tls_decrypt_sw, err = %d\n", sk, err);
+ } else {
+ printk("sk: %p, tls_decrypt_device, err = %d\n", sk, err);
+ }
+ printk("skb: %p, len = %d, nfrags = %d\n", darg->skb,
darg->skb->len, skb_shinfo(darg->skb)->nr_frags);
+ for (i = 0; i < skb_shinfo(darg->skb)->nr_frags; i++)
+ printk("frag#%d: %p, len %d\n", i,
&skb_shinfo(darg->skb)->frags[i],
+
skb_frag_size(&skb_shinfo(darg->skb)->frags[i]));
if (err < 0)
return err;
> tls_decrypt_sw() will also get used in partially decrypted records,
> right?
Right.
View attachment "dmesg_tls_rx_decrypt_err_2.txt" of type "text/plain" (41837 bytes)
Powered by blists - more mailing lists