Message-ID: <CA+FuTSf3udp_d13Y8wg-vFsF2vttZ_A5_tE-EDj9z+pfZVCf5g@mail.gmail.com>
Date:   Thu, 23 Apr 2020 09:43:21 -0400
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Cambda Zhu <cambda@...ux.alibaba.com>
Cc:     netdev <netdev@...r.kernel.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Dust Li <dust.li@...ux.alibaba.com>,
        Tony Lu <tonylu@...ux.alibaba.com>
Subject: Re: [PATCH net-next v2] net: Add TCP_FORCE_LINGER2 to TCP setsockopt

On Thu, Apr 23, 2020 at 3:36 AM Cambda Zhu <cambda@...ux.alibaba.com> wrote:
>
> This patch adds a new TCP socket option named TCP_FORCE_LINGER2. The
> option has same behavior as TCP_LINGER2, except the tp->linger2 value
> can be greater than sysctl_tcp_fin_timeout if the user_ns is capable
> with CAP_NET_ADMIN.
>
> As a server, different sockets may need different FIN-WAIT timeout and
> in most cases the system default value will be used. The timeout can
> be adjusted by setting TCP_LINGER2 but cannot be greater than the
> system default value. If one socket needs a timeout greater than the
> default, we have to adjust the sysctl which affects all sockets using
> the system default value. And if we want to adjust it for just one
> socket and keep the original value for others, all the other sockets
> have to set TCP_LINGER2. But with TCP_FORCE_LINGER2, the net admin can
> set greater tp->linger2 than the default for one socket and keep
> the sysctl_tcp_fin_timeout unchanged.
>
> Signed-off-by: Cambda Zhu <cambda@...ux.alibaba.com>
> ---
>  Changes in v2:
>    - Add int overflow check.
>
>  include/uapi/linux/capability.h |  1 +
>  include/uapi/linux/tcp.h        |  1 +
>  net/ipv4/tcp.c                  | 11 +++++++++++
>  3 files changed, 13 insertions(+)
>
> diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
> index 272dc69fa080..0e30c9756a04 100644
> --- a/include/uapi/linux/capability.h
> +++ b/include/uapi/linux/capability.h
> @@ -199,6 +199,7 @@ struct vfs_ns_cap_data {
>  /* Allow multicasting */
>  /* Allow read/write of device-specific registers */
>  /* Allow activation of ATM control sockets */
> +/* Allow setting TCP_LINGER2 regardless of sysctl_tcp_fin_timeout */
>
>  #define CAP_NET_ADMIN        12
>
> diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
> index f2acb2566333..e21e0ce98ca1 100644
> --- a/include/uapi/linux/tcp.h
> +++ b/include/uapi/linux/tcp.h
> @@ -128,6 +128,7 @@ enum {
>  #define TCP_CM_INQ             TCP_INQ
>
>  #define TCP_TX_DELAY           37      /* delay outgoing packets by XX usec */
> +#define TCP_FORCE_LINGER2      38      /* Set TCP_LINGER2 regardless of sysctl_tcp_fin_timeout */
>
>
>  #define TCP_REPAIR_ON          1
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 6d87de434377..d8cd1fd66bc1 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -3149,6 +3149,17 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
>                         tcp_enable_tx_delay();
>                 tp->tcp_tx_delay = val;
>                 break;
> +       case TCP_FORCE_LINGER2:
> +               if (val < 0)
> +                       tp->linger2 = -1;
> +               else if (val > INT_MAX / HZ)
> +                       err = -EINVAL;
> +               else if (val > net->ipv4.sysctl_tcp_fin_timeout / HZ &&
> +                        !ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
> +                       tp->linger2 = 0;

Instead of silently falling back to TCP_LINGER2 behavior for
unprivileged users, I would fail the call with -EPERM when the caller
lacks the privilege, as SO_SNDBUFFORCE and SO_RCVBUFFORCE do.

Also, those options use capable() rather than ns_capable(). If there is
a risk to system integrity, capable() is the right choice.
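
To make the suggested control flow concrete, here is a minimal userspace
model of the TCP_FORCE_LINGER2 case with the privilege check failing
instead of clamping. It is a sketch, not kernel code: MODEL_HZ and
model_force_linger2() are hypothetical names, the capability check is
reduced to a boolean argument, and -EPERM is assumed as the error code
because that is what SO_SNDBUFFORCE and SO_RCVBUFFORCE return.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's HZ (ticks per second). */
#define MODEL_HZ 250

/*
 * Userspace model of the TCP_FORCE_LINGER2 case with the review
 * suggestion applied.
 *
 * val                 requested timeout in seconds
 * fin_timeout_jiffies stand-in for net->ipv4.sysctl_tcp_fin_timeout
 * privileged          outcome of the capability check, whichever
 *                     helper (capable() or ns_capable()) is chosen
 * linger2             written only on success, in jiffies
 */
static int model_force_linger2(int val, int fin_timeout_jiffies,
                               bool privileged, int *linger2)
{
    assert(linger2 != NULL);

    if (val < 0) {
        /* Negative input maps to -1, as in the patch. */
        *linger2 = -1;
        return 0;
    }
    if (val > INT_MAX / MODEL_HZ)
        return -EINVAL;             /* val * MODEL_HZ would overflow */
    if (val > fin_timeout_jiffies / MODEL_HZ && !privileged)
        return -EPERM;              /* fail instead of silently clamping */

    *linger2 = val * MODEL_HZ;
    return 0;
}
```

With a 60 second limit, an unprivileged request for 120 seconds returns
-EPERM and leaves the stored value untouched, while the same request
from a privileged caller stores 120 * MODEL_HZ. Requests at or below the
limit succeed in both cases.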

As a slight aside: if the original setsockopt had checked optlen ==
sizeof(int), we could have added a variant of a different size (say,
with an additional flags field) instead of having to create a new
socket option.
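
The size-based variant mentioned in this aside could look roughly like
the sketch below, written as plain userspace C. Everything in it is
hypothetical: struct linger2_ext, LINGER2_EXT_FORCE and
parse_linger2_optval() do not exist in the kernel UAPI. It only
illustrates how a strict length check lets one option number carry two
argument layouts.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical extended argument: timeout plus a flags word. */
struct linger2_ext {
    int32_t  seconds;
    uint32_t flags;
};

#define LINGER2_EXT_FORCE 0x1u  /* hypothetical: allow exceeding the sysctl */

/* Parsed form shared by both argument layouts. */
struct linger2_request {
    int seconds;
    int force;
};

/*
 * Select the layout strictly by length: exactly sizeof(int) is the
 * legacy form, exactly sizeof(struct linger2_ext) is the extended
 * form, and any other length is rejected.
 */
static int parse_linger2_optval(const void *optval, size_t optlen,
                                struct linger2_request *out)
{
    assert(optval != NULL && out != NULL);

    if (optlen == sizeof(int)) {
        int v;

        memcpy(&v, optval, sizeof(v));
        out->seconds = v;
        out->force = 0;
        return 0;
    }
    if (optlen == sizeof(struct linger2_ext)) {
        struct linger2_ext ext;

        memcpy(&ext, optval, sizeof(ext));
        if (ext.flags & ~LINGER2_EXT_FORCE)
            return -EINVAL;         /* reject unknown flag bits */
        out->seconds = ext.seconds;
        out->force = (ext.flags & LINGER2_EXT_FORCE) != 0;
        return 0;
    }
    return -EINVAL;
}
```

The scheme only works if the legacy path rejects every length other
than sizeof(int). An option that accepts any buffer of at least
sizeof(int) bytes cannot later give a longer buffer a new meaning
without changing behavior for existing callers.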

> +               else
> +                       tp->linger2 = val * HZ;
> +               break;
>         default:
>                 err = -ENOPROTOOPT;
>                 break;
> --
> 2.16.6
>
