lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Mon, 22 Jun 2020 09:56:28 -0700 From: Eric Dumazet <eric.dumazet@...il.com> To: Tariq Toukan <tariqt@...lanox.com>, Eric Dumazet <eric.dumazet@...il.com>, "David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org> Cc: netdev@...r.kernel.org, Saeed Mahameed <saeedm@...lanox.com>, Boris Pismenny <borisp@...lanox.com> Subject: Re: [PATCH net] net: Do not clear the socket TX queue in sock_orphan() On 6/22/20 9:24 AM, Tariq Toukan wrote: > > > On 6/22/2020 6:53 PM, Eric Dumazet wrote: >> >> >> On 6/21/20 7:09 AM, Tariq Toukan wrote: >>> sock_orphan() call to sk_set_socket() implies clearing the sock TX queue. >>> This might cause unexpected out-of-order transmit, as outstanding packets >>> can pick a different TX queue and bypass the ones already queued. >>> This is undesired in general. More specifically, it breaks the in-order >>> scheduling property guarantee for device-offloaded TLS sockets. >>> >>> Introduce a function variation __sk_set_socket() that does not clear >>> the TX queue, and call it from sock_orphan(). >>> All other callers of sk_set_socket() do not operate on an active socket, >>> so they do not need this change. >>> >>> Fixes: e022f0b4a03f ("net: Introduce sk_tx_queue_mapping") >>> Signed-off-by: Tariq Toukan <tariqt@...lanox.com> >>> Reviewed-by: Boris Pismenny <borisp@...lanox.com> >>> --- >>> include/net/sock.h | 9 +++++++-- >>> 1 file changed, 7 insertions(+), 2 deletions(-) >>> >>> Please queue for -stable. >>> >>> diff --git a/include/net/sock.h b/include/net/sock.h >>> index c53cc42b5ab9..23e43f3d79f0 100644 >>> --- a/include/net/sock.h >>> +++ b/include/net/sock.h >>> @@ -1846,10 +1846,15 @@ static inline int sk_rx_queue_get(const struct sock *sk) >>> } >>> #endif >>> +static inline void __sk_set_socket(struct sock *sk, struct socket *sock) >>> +{ >>> + sk->sk_socket = sock; >>> +} >>> + >>> static inline void sk_set_socket(struct sock *sk, struct socket *sock) >>> { >>> sk_tx_queue_clear(sk); >>> - sk->sk_socket = sock; >>> + __sk_set_socket(sk, sock); >>> } >>> >> >> Hmm... >> >> I think we should have a single sk_set_socket() call, and remove >> the sk_tx_queue_clear() from it. >> >> sk_tx_queue_clear() should be put where it is needed, instead of being hidden >> in sk_set_socket() >> > > Thanks Eric, sounds good to me, I will respin, just have some questions below. > >> diff --git a/include/net/sock.h b/include/net/sock.h >> index c53cc42b5ab92d0062519e60435b85c75564a967..3428619faae4340485b200f49d9cce4fb09086b3 100644 >> --- a/include/net/sock.h >> +++ b/include/net/sock.h >> @@ -1848,7 +1848,6 @@ static inline int sk_rx_queue_get(const struct sock *sk) >> static inline void sk_set_socket(struct sock *sk, struct socket *sock) >> { >> - sk_tx_queue_clear(sk); >> sk->sk_socket = sock; >> } >> diff --git a/net/core/sock.c b/net/core/sock.c >> index 6c4acf1f0220b1f925ebcfaa847632ec0dbe0b9b..134de0d37f77ba781b2b3341c94a97a1b2d57a2d 100644 >> --- a/net/core/sock.c >> +++ b/net/core/sock.c >> @@ -1767,6 +1767,7 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t priority, >> cgroup_sk_alloc(&sk->sk_cgrp_data); >> sock_update_classid(&sk->sk_cgrp_data); >> sock_update_netprioidx(&sk->sk_cgrp_data); >> + sk_tx_queue_clear(sk); > > Why add it here? > I don't see a call to sk_set_socket(). Yes, but the intent is to set the initial value of sk->sk_tx_queue_mapping (USHRT_MAX) when socket object is allocated/populated, not later in some unrelated paths. We move into one location all the initializers. (Most fields initial value is 0, so we do not clear them a second time) > >> } >> return sk; >> @@ -1990,6 +1991,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority) >> */ >> sk_refcnt_debug_inc(newsk); >> sk_set_socket(newsk, NULL); >> + sk_tx_queue_clear(newsk); >> RCU_INIT_POINTER(newsk->sk_wq, NULL); >> if (newsk->sk_prot->sockets_allocated) >> > > So in addition to sock_orphan(), now sk_tx_queue_clear() won't be called also from: > 1. sock_graft(): Looks fine to me. > 2. sock_init_data(): I think we should add an explicit call to sk_tx_queue_clear() here. The one for RX sk_rx_queue_clear() is already there. Why ? Initial value found is socket should be just fine. If not, normal skb->ooo_okay should prevail, if protocols really care about OOO. > 3. mptcp_sock_graft(): Looks fine to me. > > Regards, > Tariq
Powered by blists - more mailing lists