[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADvbK_d+vr-t7D1GZJ86gG6oS+Nzy7MDVh_+7Je6hqCdez4Axw@mail.gmail.com>
Date: Thu, 3 Apr 2025 14:44:18 -0400
From: Xin Long <lucien.xin@...il.com>
To: Ricardo Cañuelo Navarro <rcn@...lia.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@...il.com>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, kernel-dev@...lia.com, linux-sctp@...r.kernel.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH] sctp: check transport existence before processing a send primitive
On Thu, Apr 3, 2025 at 10:42 AM Xin Long <lucien.xin@...il.com> wrote:
>
> On Thu, Apr 3, 2025 at 5:58 AM Ricardo Cañuelo Navarro <rcn@...lia.com> wrote:
> >
> > Thanks for reviewing, answers below:
> >
> > On Wed, Apr 02 2025 at 15:40:56, Xin Long <lucien.xin@...il.com> wrote:
> > > The data send path:
> > >
> > > sctp_endpoint_lookup_assoc() ->
> > > sctp_sendmsg_to_asoc()
> > >
> > > And the transport removal path:
> > >
> > > sctp_sf_do_asconf() ->
> > > sctp_process_asconf() ->
> > > sctp_assoc_rm_peer()
> > >
> > > are both protected by the same socket lock.
> > >
> > > Additionally, when a path is removed, sctp_assoc_rm_peer() updates the
> > > transport of all existing chunks in the send queues (peer->transmitted
> > > and asoc->outqueue.out_chunk_list) to NULL.
> > >
> > > It will be great if you can reproduce the issue locally and help check
> > > how the potential race occurs.
> >
> > That's true but if there isn't enough space in the send buffer, then
> > sctp_sendmsg_to_asoc() will release the lock temporarily.
> >
> Oh right, I missed that. Thanks.
>
> > The scenario that the reproducer generates is the following:
> >
> > Thread A Thread B
> > -------------------- --------------------
> > (1) sctp_sendmsg()
> > lock_sock()
> > sctp_sendmsg_to_asoc()
> > sctp_wait_for_sndbuf()
> > release_sock()
> > sctp_setsockopt(SCTP_SOCKOPT_BINDX_REM)
> > lock_sock()
> > sctp_setsockopt_bindx()
> > sctp_send_asconf_del_ip()
> > ...
> > release_sock()
> > process rcv backlog:
> > sctp_do_sm()
> > sctp_sf_do_asconf()
> > ...
> > sctp_assoc_rm_peer()
> > lock_sock()
> > (2) chunk->transport = transport
> > sctp_primitive_SEND()
> > ...
> > sctp_outq_select_transport()
> > *BUG* switch (new_transport->state)
> >
> >
> > Notes:
> > ------
> >
> > Both threads operate on the same socket.
> >
> > 1. Here, sctp_endpoint_lookup_assoc() finds and returns an existing
> > association and transport.
> >
> > 2. At this point, `transport` is already deleted. chunk->transport is
> > not set to NULL because sctp_assoc_rm_peer() ran _before_ the transport
> > was assigned to the chunk.
> >
> > > We should avoid an extra hashtable lookup on this hot TX path, as it would
> > > negatively impact performance.
> >
> > Good point. I can't really tell the performance impact of the lookup
> > here, my experience with the SCTP implementation is very limited. Do you
> > have any suggestions or alternatives about how to deal with this?
> >
> I think the correct approach is to follow how sctp_assoc_rm_peer()
> handles this.
>
> You can use asoc->peer.last_sent_to (which isn't really used elsewhere)
> to temporarily store the transport before releasing the socket lock and
> sleeping in sctp_sendmsg_to_asoc(). After waking up and reacquiring the
> lock, restore the transport back to asoc->peer.last_sent_to.
>
> Additionally, during an ASCONF update, ensure asoc->peer.last_sent_to
> is set to a valid transport if it matches the transport being removed.
>
> For example:
>
> in sctp_wait_for_sndbuf():
>
> asoc->peer.last_sent_to = *tp;
> release_sock(sk);
> current_timeo = schedule_timeout(current_timeo);
> lock_sock(sk);
> *tp = asoc->peer.last_sent_to;
> asoc->peer.last_sent_to = NULL;
>
> in sctp_assoc_rm_peer():
>
> if (asoc->peer.last_sent_to == peer)
> asoc->peer.last_sent_to = transport;
This change introduces a side effect: when multiple threads send data
on the same asoc using different daddrs, they may interfere with each
other while waiting for buffer space, as each thread updates
asoc->peer.last_sent_to.
You may consider holding a refcnt to the transport (similar to how the
asoc refcnt is held) in sctp_wait_for_sndbuf(), as shown below:
@@ -9225,7 +9225,9 @@ static int sctp_wait_for_sndbuf(struct
sctp_association *asoc, long *timeo_p,
pr_debug("%s: asoc:%p, timeo:%ld, msg_len:%zu\n", __func__, asoc,
*timeo_p, msg_len);
- /* Increment the association's refcnt. */
+ /* Increment the transport and association's refcnt. */
+ if (t)
+ sctp_transport_hold(t);
sctp_association_hold(asoc);
/* Wait on the association specific sndbuf space. */
@@ -9234,7 +9236,7 @@ static int sctp_wait_for_sndbuf(struct
sctp_association *asoc, long *timeo_p,
TASK_INTERRUPTIBLE);
if (asoc->base.dead)
goto do_dead;
- if (!*timeo_p)
+ if (!*timeo_p || (t && t->dead))
goto do_nonblock;
if (sk->sk_err || asoc->state >= SCTP_STATE_SHUTDOWN_PENDING)
goto do_error;
@@ -9259,7 +9261,9 @@ static int sctp_wait_for_sndbuf(struct
sctp_association *asoc, long *timeo_p,
out:
finish_wait(&asoc->wait, &wait);
- /* Release the association's refcnt. */
+ /* Release the transport and association's refcnt. */
+ if (t)
+ sctp_transport_put(t);
sctp_association_put(asoc);
You will need to reintroduce the dead bit in struct sctp_transport and
set it in sctp_transport_free(). Note this field was previously removed in:
commit 47faa1e4c50ec26e6e75dcd1ce53f064bd45f729
Author: Xin Long <lucien.xin@...il.com>
Date: Fri Jan 22 01:49:09 2016 +0800
sctp: remove the dead field of sctp_transport
Thanks.
Powered by blists - more mailing lists