[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87tt75efdj.fsf@igalia.com>
Date: Thu, 03 Apr 2025 11:58:00 +0200
From: Ricardo CaƱuelo Navarro <rcn@...lia.com>
To: Xin Long <lucien.xin@...il.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@...il.com>, "David S. Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
<kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Simon Horman
<horms@...nel.org>, kernel-dev@...lia.com, linux-sctp@...r.kernel.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH] sctp: check transport existence before processing a
send primitive
Thanks for reviewing, answers below:
On Wed, Apr 02 2025 at 15:40:56, Xin Long <lucien.xin@...il.com> wrote:
> The data send path:
>
> sctp_endpoint_lookup_assoc() ->
> sctp_sendmsg_to_asoc()
>
> And the transport removal path:
>
> sctp_sf_do_asconf() ->
> sctp_process_asconf() ->
> sctp_assoc_rm_peer()
>
> are both protected by the same socket lock.
>
> Additionally, when a path is removed, sctp_assoc_rm_peer() updates the
> transport of all existing chunks in the send queues (peer->transmitted
> and asoc->outqueue.out_chunk_list) to NULL.
>
> It will be great if you can reproduce the issue locally and help check
> how the potential race occurs.
That's true but if there isn't enough space in the send buffer, then
sctp_sendmsg_to_asoc() will release the lock temporarily.
The scenario that the reproducer generates is the following:
Thread A Thread B
-------------------- --------------------
(1) sctp_sendmsg()
lock_sock()
sctp_sendmsg_to_asoc()
sctp_wait_for_sndbuf()
release_sock()
sctp_setsockopt(SCTP_SOCKOPT_BINDX_REM)
lock_sock()
sctp_setsockopt_bindx()
sctp_send_asconf_del_ip()
...
release_sock()
process rcv backlog:
sctp_do_sm()
sctp_sf_do_asconf()
...
sctp_assoc_rm_peer()
lock_sock()
(2) chunk->transport = transport
sctp_primitive_SEND()
...
sctp_outq_select_transport()
*BUG* switch (new_transport->state)
Notes:
------
Both threads operate on the same socket.
1. Here, sctp_endpoint_lookup_assoc() finds and returns an existing
association and transport.
2. At this point, `transport` is already deleted. chunk->transport is
not set to NULL because sctp_assoc_rm_peer() ran _before_ the transport
was assigned to the chunk.
> We should avoid an extra hashtable lookup on this hot TX path, as it would
> negatively impact performance.
Good point. I can't really tell the performance impact of the lookup
here, my experience with the SCTP implementation is very limited. Do you
have any suggestions or alternatives about how to deal with this?
Thanks,
Ricardo
Powered by blists - more mailing lists