linux-kernel - Re: [PATCH v3] net/tls: support maximum record size limit

Message-ID: <aLgVCGbq0b6PJXbY@krikkit>
Date: Wed, 3 Sep 2025 12:14:32 +0200
From: Sabrina Dubroca <sd@...asysnail.net>
To: Wilfred Mallawa <wilfred.opensource@...il.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
	pabeni@...hat.com, horms@...nel.org, corbet@....net,
	john.fastabend@...il.com, netdev@...r.kernel.org,
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
	alistair.francis@....com, dlemoal@...nel.org,
	Wilfred Mallawa <wilfred.mallawa@....com>
Subject: Re: [PATCH v3] net/tls: support maximum record size limit

Note: since this is a new feature, the subject prefix should be
"[PATCH net-next vN]" (i.e. add "net-next", the target tree for new
features).

2025-09-03, 11:47:57 +1000, Wilfred Mallawa wrote:
> diff --git a/Documentation/networking/tls.rst b/Documentation/networking/tls.rst
> index 36cc7afc2527..0232df902320 100644
> --- a/Documentation/networking/tls.rst
> +++ b/Documentation/networking/tls.rst
> @@ -280,6 +280,13 @@ If the record decrypted turns out to had been padded or is not a data
>  record it will be decrypted again into a kernel buffer without zero copy.
>  Such events are counted in the ``TlsDecryptRetry`` statistic.
>  
> +TLS_TX_RECORD_SIZE_LIM
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +During a TLS handshake, an endpoint may use the record size limit extension
> +to specify a maximum record size. This allows enforcing the specified record
> +size limit, such that outgoing records do not exceed the limit specified.

Maybe worth adding a reference to the RFC that defines this extension
(RFC 8449)? I'm not sure if that would be helpful to readers of this
doc or not.


> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index a3ccb3135e51..94237c97f062 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
[...]
> @@ -1022,6 +1075,7 @@ static int tls_init(struct sock *sk)
>  
>  	ctx->tx_conf = TLS_BASE;
>  	ctx->rx_conf = TLS_BASE;
> +	ctx->tx_record_size_limit = TLS_MAX_PAYLOAD_SIZE;
>  	update_sk_prot(sk, ctx);
>  out:
>  	write_unlock_bh(&sk->sk_callback_lock);
> @@ -1065,7 +1119,7 @@ static u16 tls_user_config(struct tls_context *ctx, bool tx)
>  
>  static int tls_get_info(struct sock *sk, struct sk_buff *skb, bool net_admin)
>  {
> -	u16 version, cipher_type;
> +	u16 version, cipher_type, tx_record_size_limit;
>  	struct tls_context *ctx;
>  	struct nlattr *start;
>  	int err;
> @@ -1110,7 +1164,13 @@ static int tls_get_info(struct sock *sk, struct sk_buff *skb, bool net_admin)
>  		if (err)
>  			goto nla_failure;
>  	}
> -
> +	tx_record_size_limit = ctx->tx_record_size_limit;
> +	if (tx_record_size_limit) {

You probably meant to update that to:

    tx_record_size_limit != TLS_MAX_PAYLOAD_SIZE

Otherwise, now that the default is TLS_MAX_PAYLOAD_SIZE, it will
always be exported - which is not wrong either. So I'd either update
the conditional so that the attribute is only exported for non-default
sizes (like in v2), or drop the if() and always export it.

> +		err = nla_put_u16(skb, TLS_INFO_TX_RECORD_SIZE_LIM,
> +				  tx_record_size_limit);
> +		if (err)
> +			goto nla_failure;
> +	}

[...]
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index bac65d0d4e3e..28fb796573d1 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -1079,7 +1079,7 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
>  		orig_size = msg_pl->sg.size;
>  		full_record = false;
>  		try_to_copy = msg_data_left(msg);
> -		record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl->sg.size;
> +		record_room = tls_ctx->tx_record_size_limit - msg_pl->sg.size;

If we entered tls_sw_sendmsg_locked with an existing open record,
record_room could end up being negative and confuse the rest of the
code.

    send(MSG_MORE) returns with an open record of length len1
    setsockopt(TLS_TX_RECORD_SIZE_LIM, limit < len1)
    send() -> record_room < 0


Possibly not a problem with a "well-behaved" userspace, but we can't
rely on that.
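
To make the arithmetic concrete, here is a minimal userspace sketch (not
kernel code). The helper names are hypothetical, and it assumes
record_room is a signed int and that an open record never exceeds the
16 KiB default. The clamped variant only avoids the negative value; an
over-limit open record would still have to be dealt with separately.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stands in for the kernel's TLS_MAX_PAYLOAD_SIZE (16 KiB). */
#define SKETCH_TLS_MAX_PAYLOAD_SIZE (1u << 14)

/* Models "record_room = limit - msg_pl->sg.size" with no guard. */
static inline int record_room_unchecked(uint16_t limit, uint32_t open_rec_size)
{
	assert(open_rec_size <= SKETCH_TLS_MAX_PAYLOAD_SIZE);
	return (int)limit - (int)open_rec_size;
}

/* Hypothetical guard: report zero room once the open record reaches the limit. */
static inline int record_room_clamped(uint16_t limit, uint32_t open_rec_size)
{
	assert(open_rec_size <= SKETCH_TLS_MAX_PAYLOAD_SIZE);
	if (open_rec_size >= limit)
		return 0;
	return (int)(limit - open_rec_size);
}
```

With len1 = 4000 and a new limit of 1024, the unchecked form yields
-2976, while the clamped form yields 0.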


Pushing out the pending "too big" record at the time we set
tx_record_size_limit would likely make the peer close the connection
(because it's already told us to limit our TX size), so I guess we'd
have to split the pending record into tx_record_size_limit chunks
before we start processing the new message (either directly at
setsockopt(TLS_TX_RECORD_SIZE_LIM) time, or on the next send/etc
call). The final push during socket closing, and maybe some more
codepaths that deal with ctx->open_rec, would also have to do that.
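
As a rough illustration of the sizes such a split would produce, here is
a userspace sketch with hypothetical helper names. It only models the
payload lengths (every chunk is full except possibly the last); it says
nothing about how the kernel would actually re-split the scatterlist of
the open record.

```c
#include <assert.h>
#include <stdint.h>

/* Number of records needed to carry pending_len bytes under limit. */
static inline uint32_t split_nr_chunks(uint32_t pending_len, uint16_t limit)
{
	assert(limit > 0);
	return (pending_len + limit - 1) / limit;
}

/* Payload length of chunk idx (0-based), or 0 if idx is past the end. */
static inline uint16_t split_chunk_len(uint32_t pending_len, uint16_t limit,
				       uint32_t idx)
{
	uint32_t nr = split_nr_chunks(pending_len, limit);

	if (idx >= nr)
		return 0;
	if (idx < nr - 1)
		return limit;
	/* Last chunk carries whatever remains after the full-size ones. */
	return (uint16_t)(pending_len - (nr - 1) * limit);
}
```

For a pending record of 4000 bytes and a limit of 1024, this gives four
chunks of 1024, 1024, 1024 and 928 bytes.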

I think additional selftests for
    send(MSG_MORE), TLS_TX_RECORD_SIZE_LIM, send
and
    send(MSG_MORE), TLS_TX_RECORD_SIZE_LIM, close
verifying the received record sizes would make sense, since it's a bit
tricky to get that right.
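
One possible way to check record sizes on the receiving side, assuming
the peer reads the raw byte stream from a plain TCP socket (no kTLS RX),
is to walk the 5-byte TLS record headers. The helper below is a
hypothetical sketch, not an existing selftest function. The length field
covers the ciphertext, so a real test would still need to account for
the per-cipher overhead (AEAD tag, and the inner content type byte for
TLS 1.3) before comparing against the configured limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Record header: content type (1) + legacy version (2) + length (2). */
#define SKETCH_TLS_HDR_LEN 5

/*
 * Walk a buffer of back-to-back TLS records and return the largest
 * length field seen, or -1 if the buffer ends in the middle of a record.
 */
static inline long max_record_len(const uint8_t *buf, size_t len)
{
	long max = 0;
	size_t off = 0;

	assert(buf != NULL || len == 0);
	while (off < len) {
		size_t rec_len;

		if (len - off < SKETCH_TLS_HDR_LEN)
			return -1;
		/* The length field is big-endian, at header bytes 3 and 4. */
		rec_len = ((size_t)buf[off + 3] << 8) | buf[off + 4];
		if (len - off - SKETCH_TLS_HDR_LEN < rec_len)
			return -1;
		if ((long)rec_len > max)
			max = (long)rec_len;
		off += SKETCH_TLS_HDR_LEN + rec_len;
	}
	return max;
}
```

A test could then assert that the returned value does not exceed the
limit plus the expected overhead for the cipher in use.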

-- 
Sabrina