Message-ID: <20210127175611.62871-1-kuniyu@amazon.co.jp>
Date:   Thu, 28 Jan 2021 02:56:11 +0900
From:   Kuniyuki Iwashima <kuniyu@...zon.co.jp>
To:     <edumazet@...gle.com>
CC:     <aams@...zon.de>, <borisp@...lanox.com>, <davem@...emloft.net>,
        <kuba@...nel.org>, <kuni1840@...il.com>, <kuniyu@...zon.co.jp>,
        <linux-kernel@...r.kernel.org>, <netdev@...r.kernel.org>,
        <tariqt@...lanox.com>
Subject: Re: [PATCH net] net: Remove redundant calls of sk_tx_queue_clear().

From:   Eric Dumazet <edumazet@...gle.com>
Date:   Wed, 27 Jan 2021 18:34:35 +0100
> On Wed, Jan 27, 2021 at 6:32 PM Kuniyuki Iwashima <kuniyu@...zon.co.jp> wrote:
> >
> > From:   Eric Dumazet <edumazet@...gle.com>
> > Date:   Wed, 27 Jan 2021 18:05:24 +0100
> > > On Wed, Jan 27, 2021 at 5:52 PM Kuniyuki Iwashima <kuniyu@...zon.co.jp> wrote:
> > > >
> > > > From:   Eric Dumazet <edumazet@...gle.com>
> > > > Date:   Wed, 27 Jan 2021 15:54:32 +0100
> > > > > On Wed, Jan 27, 2021 at 1:50 PM Kuniyuki Iwashima <kuniyu@...zon.co.jp> wrote:
> > > > > >
> > > > > > The commit 41b14fb8724d ("net: Do not clear the sock TX queue in
> > > > > > sk_set_socket()") removes sk_tx_queue_clear() from sk_set_socket() and adds
> > > > > > it instead in sk_alloc() and sk_clone_lock() to fix an issue introduced in
> > > > > > the commit e022f0b4a03f ("net: Introduce sk_tx_queue_mapping"). However,
> > > > > > the original commit had already put sk_tx_queue_clear() in sk_prot_alloc():
> > > > > > the callee of sk_alloc() and sk_clone_lock(). Thus sk_tx_queue_clear() is
> > > > > > called twice in each path currently.
> > > > >
> > > > > Are you sure ?
> > > > >
> > > > > I do not clearly see the sk_tx_queue_clear() call from the cloning part.
> > > > >
> > > > > Please elaborate.
> > > >
> > > > If sk is not NULL in sk_prot_alloc(), sk_tx_queue_clear() is called [1].
> > > > Also the callers of sk_prot_alloc() are only sk_alloc() and sk_clone_lock().
> > > > If they finally return a non-NULL pointer, sk_tx_queue_clear() is called
> > > > again in each function [2][3].
> > > >
> > > > In the cloning part, sock_copy() is called after sk_prot_alloc(), but
> > > > skc_tx_queue_mapping is defined between skc_dontcopy_begin and
> > > > skc_dontcopy_end in struct sock_common [4]. So, sock_copy() does not
> > > > overwrite skc_tx_queue_mapping, and thus we can initialize it in
> > > > sk_prot_alloc().
> > >
> > > That is a lot of assumptions.
> > >
> > > What guarantees do we have that skc_tx_queue_mapping will never be
> > > moved out of this section ?
> > > AFAIK it was there by accident, for cache locality reasons, that might
> > > change in the future as we add more stuff in socket.
> > >
> > > I feel this optimization is risky for future changes, for a code path
> > > that is spending thousands of cycles anyway.
> >
> > If someone tries to move skc_tx_queue_mapping out of the section, should
> > they check where it is used ?

Sorry if that was unclear: by "someone/they" I meant the author of a patch
that moves skc_tx_queue_mapping.


> Certainly not. You hide some knowledge, without a comment or some runtime check.

My mistake. I should have explained the sock_copy() behavior in the changelog.
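
For illustration, here is a minimal userspace sketch of the behavior the
changelog left out: a copy routine that skips a "do not copy" region, so a
field initialized at allocation time survives cloning. The names toy_sock,
toy_sock_copy and TOY_NO_QUEUE are hypothetical and the struct is heavily
simplified; this is not the kernel code, only a model of the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_NO_QUEUE (-1)	/* hypothetical "no TX queue recorded" value */

/* Hypothetical, heavily simplified stand-in for struct sock. */
struct toy_sock {
	int family;		/* copied to the clone */
	/* start of the region the copy skips */
	int refcnt;		/* not copied */
	int tx_queue_mapping;	/* not copied */
	/* end of the skipped region */
	int state;		/* copied to the clone */
};

/*
 * Copy osk into nsk in two pieces, leaving the bytes from 'refcnt' up to
 * (but not including) 'state' untouched in nsk.
 */
static void toy_sock_copy(struct toy_sock *nsk, const struct toy_sock *osk)
{
	/* Everything before the skipped region. */
	memcpy(nsk, osk, offsetof(struct toy_sock, refcnt));

	/* Everything from the first field after the skipped region. */
	memcpy(&nsk->state, &osk->state,
	       sizeof(*osk) - offsetof(struct toy_sock, state));
}
```

In this model, a tx_queue_mapping set to TOY_NO_QUEUE before toy_sock_copy()
is still TOY_NO_QUEUE afterwards, but only as long as the field stays inside
the skipped region.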


> You can not ask us (maintainers) to remember thousands of tricks.

I'll keep this in mind.
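
As a sketch of what a check could look like, the layout assumption can be
stated at build time instead of being remembered. The example below is
userspace C11 with hypothetical names (guarded_sock, IN_DONTCOPY); in kernel
code a macro such as BUILD_BUG_ON() would play the role of static_assert().
It is an assumption about one possible approach, not something proposed in
this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified layout: 'refcnt' is the first skipped field,
 * 'state' is the first field after the skipped region. */
struct guarded_sock {
	int family;
	int refcnt;
	int tx_queue_mapping;
	int state;
};

/* True if 'field' lies entirely between 'first_skipped' and 'first_after'. */
#define IN_DONTCOPY(type, field, first_skipped, first_after)		\
	(offsetof(type, field) >= offsetof(type, first_skipped) &&	\
	 offsetof(type, field) + sizeof(((type *)0)->field) <=		\
		offsetof(type, first_after))

/* The build fails if someone moves the field out of the skipped region. */
static_assert(IN_DONTCOPY(struct guarded_sock, tx_queue_mapping,
			  refcnt, state),
	      "tx_queue_mapping must stay inside the dontcopy region");
```

With such a check, moving the field out of the region breaks the build
rather than silently changing what the clone inherits.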


> >
> > But I agree that we should not write error-prone code.
> >
> > Currently, sk_tx_queue_clear() is the only initialization code in
> > sk_prot_alloc(). So, does it make sense to remove sk_tx_queue_clear() in
> > sk_prot_alloc() so that it does only allocation and other fields are
> > initialized in each caller ?

May I ask what you think about this?
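
To make the question concrete, here is a rough userspace sketch of that
split, assuming an allocator that only allocates and callers that each do
their own initialization. All names (demo_sock, demo_prot_alloc, demo_alloc,
demo_clone, DEMO_NO_QUEUE) are hypothetical, and the clone copies the whole
struct on purpose so that it does not depend on any field layout.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_NO_QUEUE (-1)	/* hypothetical "no TX queue recorded" value */

struct demo_sock {
	int family;
	int tx_queue_mapping;
};

/* Allocation only: zeroed memory, no other field initialization. */
static struct demo_sock *demo_prot_alloc(void)
{
	return calloc(1, sizeof(struct demo_sock));
}

static void demo_tx_queue_clear(struct demo_sock *sk)
{
	sk->tx_queue_mapping = DEMO_NO_QUEUE;
}

/* Fresh socket: the caller initializes the fields it cares about. */
static struct demo_sock *demo_alloc(int family)
{
	struct demo_sock *sk = demo_prot_alloc();

	if (sk) {
		sk->family = family;
		demo_tx_queue_clear(sk);
	}
	return sk;
}

/* Clone: copy everything, then reset the per-socket TX queue state. */
static struct demo_sock *demo_clone(const struct demo_sock *osk)
{
	struct demo_sock *nsk = demo_prot_alloc();

	if (nsk) {
		*nsk = *osk;
		demo_tx_queue_clear(nsk);
	}
	return nsk;
}
```

Here each path clears the mapping exactly once, and the clone stays correct
even if the copy routine later starts copying the field.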


> > > >
> > > > [1] sk_prot_alloc
> > > > https://github.com/torvalds/linux/blob/master/net/core/sock.c#L1693
> > > >
> > > > [2] sk_alloc
> > > > https://github.com/torvalds/linux/blob/master/net/core/sock.c#L1762
> > > >
> > > > [3] sk_clone_lock
> > > > https://github.com/torvalds/linux/blob/master/net/core/sock.c#L1986
> > > >
> > > > [4] struct sock_common
> > > > https://github.com/torvalds/linux/blob/master/include/net/sock.h#L218-L240
> > > >
> > > >
> > > > > In any case, this seems to be a candidate for net-next, this is not
> > > > > fixing a bug,
> > > > > this would be an optimization at most, and potentially adding a bug.
> > > > >
> > > > > So if you resend this patch, you can mention the old commit in the changelog,
> > > > > but do not add a dubious Fixes: tag
> > > >
> > > > I see.
> > > >
> > > > I will remove the tag and resend this as a net-next candidate.
> > > >
> > > > Thank you,
> > > > Kuniyuki
> > > >
> > > >
> > > > > >
> > > > > > This patch removes the redundant calls of sk_tx_queue_clear() in sk_alloc()
> > > > > > and sk_clone_lock().
> > > > > >
> > > > > > Fixes: 41b14fb8724d ("net: Do not clear the sock TX queue in sk_set_socket()")
> > > > > > CC: Tariq Toukan <tariqt@...lanox.com>
> > > > > > CC: Boris Pismenny <borisp@...lanox.com>
> > > > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.co.jp>
> > > > > > Reviewed-by: Amit Shah <aams@...zon.de>
> > > > > > ---
> > > > > >  net/core/sock.c | 2 --
> > > > > >  1 file changed, 2 deletions(-)
> > > > > >
> > > > > > diff --git a/net/core/sock.c b/net/core/sock.c
> > > > > > index bbcd4b97eddd..5c665ee14159 100644
> > > > > > --- a/net/core/sock.c
> > > > > > +++ b/net/core/sock.c
> > > > > > @@ -1759,7 +1759,6 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t priority,
> > > > > >                 cgroup_sk_alloc(&sk->sk_cgrp_data);
> > > > > >                 sock_update_classid(&sk->sk_cgrp_data);
> > > > > >                 sock_update_netprioidx(&sk->sk_cgrp_data);
> > > > > > -               sk_tx_queue_clear(sk);
> > > > > >         }
> > > > > >
> > > > > >         return sk;
> > > > > > @@ -1983,7 +1982,6 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
> > > > > >                  */
> > > > > >                 sk_refcnt_debug_inc(newsk);
> > > > > >                 sk_set_socket(newsk, NULL);
> > > > > > -               sk_tx_queue_clear(newsk);
> > > > > >                 RCU_INIT_POINTER(newsk->sk_wq, NULL);
> > > > > >
> > > > > >                 if (newsk->sk_prot->sockets_allocated)
> > > > > > --
> > > > > > 2.17.2 (Apple Git-113)
> > > > > >
