Message-ID: <aLgVCGbq0b6PJXbY@krikkit>
Date: Wed, 3 Sep 2025 12:14:32 +0200
From: Sabrina Dubroca <sd@...asysnail.net>
To: Wilfred Mallawa <wilfred.opensource@...il.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, horms@...nel.org, corbet@....net,
john.fastabend@...il.com, netdev@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
alistair.francis@....com, dlemoal@...nel.org,
Wilfred Mallawa <wilfred.mallawa@....com>
Subject: Re: [PATCH v3] net/tls: support maximum record size limit

Note: since this is a new feature, the subject prefix should be
"[PATCH net-next vN]" (i.e. add "net-next", the target tree for "new
feature" changes).

2025-09-03, 11:47:57 +1000, Wilfred Mallawa wrote:
> diff --git a/Documentation/networking/tls.rst b/Documentation/networking/tls.rst
> index 36cc7afc2527..0232df902320 100644
> --- a/Documentation/networking/tls.rst
> +++ b/Documentation/networking/tls.rst
> @@ -280,6 +280,13 @@ If the record decrypted turns out to had been padded or is not a data
> record it will be decrypted again into a kernel buffer without zero copy.
> Such events are counted in the ``TlsDecryptRetry`` statistic.
>
> +TLS_TX_RECORD_SIZE_LIM
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +During a TLS handshake, an endpoint may use the record size limit extension
> +to specify a maximum record size. This allows enforcing the specified record
> +size limit, such that outgoing records do not exceed the limit specified.

Maybe worth adding a reference to the RFC that defines this extension
(RFC 8449)? I'm not sure if that would be helpful to readers of this
doc or not.

> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index a3ccb3135e51..94237c97f062 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
[...]
> @@ -1022,6 +1075,7 @@ static int tls_init(struct sock *sk)
>
> ctx->tx_conf = TLS_BASE;
> ctx->rx_conf = TLS_BASE;
> + ctx->tx_record_size_limit = TLS_MAX_PAYLOAD_SIZE;
> update_sk_prot(sk, ctx);
> out:
> write_unlock_bh(&sk->sk_callback_lock);
> @@ -1065,7 +1119,7 @@ static u16 tls_user_config(struct tls_context *ctx, bool tx)
>
> static int tls_get_info(struct sock *sk, struct sk_buff *skb, bool net_admin)
> {
> - u16 version, cipher_type;
> + u16 version, cipher_type, tx_record_size_limit;
> struct tls_context *ctx;
> struct nlattr *start;
> int err;
> @@ -1110,7 +1164,13 @@ static int tls_get_info(struct sock *sk, struct sk_buff *skb, bool net_admin)
> if (err)
> goto nla_failure;
> }
> -
> + tx_record_size_limit = ctx->tx_record_size_limit;
> + if (tx_record_size_limit) {

You probably meant to update that to:

    tx_record_size_limit != TLS_MAX_PAYLOAD_SIZE

Otherwise, now that the default is TLS_MAX_PAYLOAD_SIZE, it will
always be exported - which is not wrong either. So I'd either update
the conditional so that the attribute is only exported for non-default
sizes (like in v2), or drop the if() and always export it.

> + err = nla_put_u16(skb, TLS_INFO_TX_RECORD_SIZE_LIM,
> + tx_record_size_limit);
> + if (err)
> + goto nla_failure;
> + }
[...]
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index bac65d0d4e3e..28fb796573d1 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -1079,7 +1079,7 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
> orig_size = msg_pl->sg.size;
> full_record = false;
> try_to_copy = msg_data_left(msg);
> - record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl->sg.size;
> + record_room = tls_ctx->tx_record_size_limit - msg_pl->sg.size;

If we entered tls_sw_sendmsg_locked() with an existing open record, this
could end up being negative and confuse the rest of the code:

    send(MSG_MORE) returns with an open record of length len1
    setsockopt(TLS_TX_RECORD_SIZE_LIM, limit < len1)
    send() -> record_room < 0

Possibly not a problem with a "well-behaved" userspace, but we can't
rely on that.
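
The arithmetic behind that sequence can be replayed in userspace. This is only a sketch under assumptions: record_room() and replay_sequence() are hypothetical helpers, the result is assumed to be stored in a signed int, and the sizes 1000 and 512 are made-up example values.

```c
#include <assert.h>
#include <stdint.h>

/* Models: record_room = tls_ctx->tx_record_size_limit - msg_pl->sg.size; */
int record_room(uint16_t tx_record_size_limit, uint32_t open_rec_size)
{
	return (int)tx_record_size_limit - (int)open_rec_size;
}

/* Replays the sequence from the review with concrete example numbers. */
int replay_sequence(void)
{
	uint32_t open_rec = 1000;	/* send(MSG_MORE) left 1000 bytes pending */
	uint16_t limit = 512;		/* limit then lowered below the pending size */
	int room = record_room(limit, open_rec);

	assert(room < 0);		/* the case the review warns about */
	return room;
}
```

With these example values the room comes out as 512 - 1000, i.e. negative, whereas with the default limit of 16384 the same open record would still leave positive room.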

Pushing out the pending "too big" record at the time we set
tx_record_size_limit would likely make the peer close the connection
(because it has already told us to limit our TX size), so I guess we'd
have to split the pending record into tx_record_size_limit-sized chunks
before we start processing the new message (either directly at
setsockopt(TLS_TX_RECORD_SIZE_LIM) time, or on the next send/etc
call). The final push during socket closing, and maybe some more
codepaths that deal with ctx->open_rec, would also have to do that.
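
As a rough illustration of the splitting idea only (the real change would have to operate on the open record's scatterlists, which this does not attempt), the chunk sizes could be computed like this. split_pending_record() is a hypothetical userspace helper, not an existing kernel function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: compute the payload sizes that an open record of
 * pending_len bytes would be split into so that no chunk exceeds limit.
 * Stores up to max_chunks sizes in chunks[] and returns the number of
 * chunks needed (which may be larger than max_chunks).
 */
size_t split_pending_record(size_t pending_len, size_t limit,
			    size_t *chunks, size_t max_chunks)
{
	size_t n = 0;

	assert(limit > 0);
	while (pending_len > 0) {
		size_t this_len = pending_len < limit ? pending_len : limit;

		if (n < max_chunks)
			chunks[n] = this_len;
		n++;
		pending_len -= this_len;
	}
	return n;
}
```

For example, a pending record of 1000 bytes with a limit of 400 would go out as three records of 400, 400 and 200 bytes.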

I think additional selftests for

    send(MSG_MORE), setsockopt(TLS_TX_RECORD_SIZE_LIM), send

and

    send(MSG_MORE), setsockopt(TLS_TX_RECORD_SIZE_LIM), close

verifying the received record sizes would make sense, since it's a bit
tricky to get that right.

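
For the "verifying the received record sizes" part, one possible shape is a helper that walks the raw byte stream and checks each record's length field. This is a sketch, assuming the test reads the undecrypted TCP stream on a peer socket without kTLS RX; records_within_limit() is a hypothetical name, and max_len is the on-the-wire bound, so the caller would have to add the cipher overhead to the configured limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* TLS record header: 1 byte type, 2 bytes version, 2 bytes length. */
#define TLS_HEADER_SIZE 5

/*
 * Returns true if buf holds only complete TLS records whose big-endian
 * length field is <= max_len. Returns false on an oversized record or
 * on a truncated header/payload.
 */
bool records_within_limit(const uint8_t *buf, size_t len, size_t max_len)
{
	size_t off = 0;

	assert(buf != NULL || len == 0);
	while (off < len) {
		size_t rec_len;

		if (len - off < TLS_HEADER_SIZE)
			return false;
		rec_len = ((size_t)buf[off + 3] << 8) | buf[off + 4];
		if (rec_len > max_len)
			return false;
		if (len - off - TLS_HEADER_SIZE < rec_len)
			return false;
		off += TLS_HEADER_SIZE + rec_len;
	}
	return true;
}
```

Both proposed test sequences could then call this on everything the peer received after the limit was set.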
--
Sabrina