lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADvbK_evd2=Cs5EZGf3EVBiY5SvF_aOtbu6wMjj_mSgFgfBpzw@mail.gmail.com>
Date: Thu, 6 Nov 2025 14:24:21 -0500
From: Xin Long <lucien.xin@...il.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: network dev <netdev@...r.kernel.org>, quic@...ts.linux.dev, davem@...emloft.net, 
	kuba@...nel.org, Eric Dumazet <edumazet@...gle.com>, Simon Horman <horms@...nel.org>, 
	Stefan Metzmacher <metze@...ba.org>, Moritz Buhl <mbuhl@...nbsd.org>, Tyler Fanelli <tfanelli@...hat.com>, 
	Pengtao He <hepengtao@...omi.com>, Thomas Dreibholz <dreibh@...ula.no>, linux-cifs@...r.kernel.org, 
	Steve French <smfrench@...il.com>, Namjae Jeon <linkinjeon@...nel.org>, 
	Paulo Alcantara <pc@...guebit.com>, Tom Talpey <tom@...pey.com>, kernel-tls-handshake@...ts.linux.dev, 
	Chuck Lever <chuck.lever@...cle.com>, Jeff Layton <jlayton@...nel.org>, 
	Steve Dickson <steved@...hat.com>, Hannes Reinecke <hare@...e.de>, Alexander Aring <aahringo@...hat.com>, 
	David Howells <dhowells@...hat.com>, Matthieu Baerts <matttbe@...nel.org>, 
	John Ericson <mail@...nericson.me>, Cong Wang <xiyou.wangcong@...il.com>, 
	"D . Wythe" <alibuda@...ux.alibaba.com>, Jason Baron <jbaron@...mai.com>, 
	illiliti <illiliti@...tonmail.com>, Sabrina Dubroca <sd@...asysnail.net>, 
	Marcelo Ricardo Leitner <marcelo.leitner@...il.com>, Daniel Stenberg <daniel@...x.se>, 
	Andy Gospodarek <andrew.gospodarek@...adcom.com>
Subject: Re: [PATCH net-next v4 15/15] quic: add packet builder and parser base

On Tue, Nov 4, 2025 at 9:44 AM Paolo Abeni <pabeni@...hat.com> wrote:
>
> On 10/29/25 3:35 PM, Xin Long wrote:
> > +/* Process PMTU reduction event on a QUIC socket. */
> > +void quic_packet_rcv_err_pmtu(struct sock *sk)
> > +{
> > +     struct quic_path_group *paths = quic_paths(sk);
> > +     struct quic_packet *packet = quic_packet(sk);
> > +     struct quic_config *c = quic_config(sk);
> > +     u32 pathmtu, info, taglen;
> > +     struct dst_entry *dst;
> > +     bool reset_timer;
> > +
> > +     if (!ip_sk_accept_pmtu(sk))
> > +             return;
> > +
> > +     info = clamp(paths->mtu_info, QUIC_PATH_MIN_PMTU, QUIC_PATH_MAX_PMTU);
> > +     /* If PLPMTUD is not enabled, update MSS using the route and ICMP info. */
> > +     if (!c->plpmtud_probe_interval) {
> > +             if (quic_packet_route(sk) < 0)
> > +                     return;
> > +
> > +             dst = __sk_dst_get(sk);
> > +             dst->ops->update_pmtu(dst, sk, NULL, info, true);
> > +             quic_packet_mss_update(sk, info - packet->hlen);
> > +             return;
> > +     }
> > +     /* PLPMTUD is enabled: adjust to smaller PMTU, subtract headers and AEAD tag.  Also
> > +      * notify the QUIC path layer for possible state changes and probing.
> > +      */
> > +     taglen = quic_packet_taglen(packet);
> > +     info = info - packet->hlen - taglen;
> > +     pathmtu = quic_path_pl_toobig(paths, info, &reset_timer);
> > +     if (reset_timer)
> > +             quic_timer_reset(sk, QUIC_TIMER_PMTU, c->plpmtud_probe_interval);
> > +     if (pathmtu)
> > +             quic_packet_mss_update(sk, pathmtu + taglen);
> > +}
> > +
> > +/* Handle ICMP Toobig packet and update QUIC socket path MTU. */
> > +static int quic_packet_rcv_err(struct sk_buff *skb)
> > +{
> > +     union quic_addr daddr, saddr;
> > +     struct sock *sk = NULL;
> > +     int ret = 0;
> > +     u32 info;
> > +
> > +     /* All we can do is lookup the matching QUIC socket by addresses. */
> > +     quic_get_msg_addrs(skb, &saddr, &daddr);
> > +     sk = quic_sock_lookup(skb, &daddr, &saddr, NULL);
> > +     if (!sk)
> > +             return -ENOENT;
> > +
> > +     bh_lock_sock(sk);
> > +     if (quic_is_listen(sk))
>
> The above looks race-prone. You should check the status only when
> holding the sk socket lock, i.e. if !sock_owned_by_user(sk)
This check can be deleted now, as quic_sock_lookup()
has only returned regular socket since I moved listen socket to
another hashtable.

>
> > +             goto out;
> > +
> > +     if (quic_get_mtu_info(skb, &info))
> > +             goto out;
>
> This can be moved outside the lock.
>
Right.

> > +
> > +     ret = 1; /* Success: update socket path MTU info. */
> > +     quic_paths(sk)->mtu_info = info;
> > +     if (sock_owned_by_user(sk)) {
> > +             /* Socket is in use by userspace context.  Defer MTU processing to later via
> > +              * tasklet.  Ensure the socket is not dropped before deferral.
> > +              */
> > +             if (!test_and_set_bit(QUIC_MTU_REDUCED_DEFERRED, &sk->sk_tsq_flags))
> > +                     sock_hold(sk);
> > +             goto out;
> > +     }
> > +     /* Otherwise, process the MTU reduction now. */
> > +     quic_packet_rcv_err_pmtu(sk);
> > +out:
> > +     bh_unlock_sock(sk);
> > +     sock_put(sk);
> > +     return ret;
> > +}
> > +
> > +#define QUIC_PACKET_BACKLOG_MAX              4096
> > +
> > +/* Queue a packet for later processing when sleeping is allowed. */
> > +static int quic_packet_backlog_schedule(struct net *net, struct sk_buff *skb)
> > +{
> > +     struct quic_skb_cb *cb = QUIC_SKB_CB(skb);
> > +     struct quic_net *qn = quic_net(net);
> > +
> > +     if (cb->backlog)
> > +             return 0;
>
> The above test is present also in the only caller of this function. It
> should be removed from there.
I may delete the one from the caller, as there will be other
callers in the 2nd patchset need it to be checked in here.

>
> [...]> +/* Work function to process packets in the backlog queue. */
> > +void quic_packet_backlog_work(struct work_struct *work)
> > +{
> > +     struct quic_net *qn = container_of(work, struct quic_net, work);
> > +     struct sk_buff *skb;
> > +     struct sock *sk;
> > +
> > +     skb = skb_dequeue(&qn->backlog_list);
> > +     while (skb) {
> > +             sk = quic_packet_get_listen_sock(skb);
> > +             if (!sk)
> > +                     continue;
> > +
> > +             lock_sock(sk);
>
> Possibly lock_sock_fast(sk);
These are some control QUIC packets (not on the main data path) for
which we need to generate keys from header fields in order to process
them.

However, the kernel crypto API cannot install a key into a tfm in
atomic context, so we must use a workqueue to handle key generation,
installation, and then decryption.

lock_sock_fast() can not be used here,  otherwise this path runs in
atomic context again.

>
> > +             quic_packet_process(sk, skb);
> > +             release_sock(sk);
> > +             sock_put(sk);
> > +
> > +             skb = skb_dequeue(&qn->backlog_list);
> > +     }
> > +}
>
> [...]> +/* Create and transmit a new QUIC packet. */
> > +int quic_packet_create(struct sock *sk)
> Possibly rename the function accordingly to its actual action, i.e.
> quic_packet_create_xmit()
I will use quic_packet_create_and_xmit().

>
> [...]> @@ -291,6 +294,8 @@ static void __net_exit quic_net_exit(struct
> net *net)
> >  #ifdef CONFIG_PROC_FS
> >       quic_net_proc_exit(net);
> >  #endif
> > +     skb_queue_purge(&qn->backlog_list);
> > +     cancel_work_sync(&qn->work);
>
> Likely: disable_work_sync()
>
Will update it.

> >       quic_crypto_free(&qn->crypto);
> >       free_percpu(qn->stat);
> >       qn->stat = NULL;
>
> EPATCHISTOOBIG, very hard to process. Please split this one it at least
> 2 (i.e. rx and tx part), even if the series will grow above 15
>
Sure, I will do it.

BTW, about the MAINTAINERS entry Jakub mentioned, should I
create a new patch for it, or append it in one of these patches, like into:

[PATCH net-next 02/15] net: build socket infrastructure for QUIC protocol

Thanks.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ