lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 19 Nov 2020 15:09:48 -0800
From:   Wei Wang <weiwan@...gle.com>
To:     Alexander Duyck <alexander.duyck@...il.com>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        bpf@...r.kernel.org, daniel@...earbox.net,
        Martin KaFai Lau <kafai@...com>, kernel-team@...com,
        Eric Dumazet <edumazet@...gle.com>, brakmo@...com,
        alexanderduyck@...com
Subject: Re: [net PATCH 1/2] tcp: Allow full IP tos/IPv6 tclass to be
 reflected in L3 header

On Thu, Nov 19, 2020 at 1:23 PM Alexander Duyck
<alexander.duyck@...il.com> wrote:
>
> From: Alexander Duyck <alexanderduyck@...com>
>
> An issue was recently found where DCTCP SYN/ACK packets did not have the
> ECT bit set in the L3 header. A bit of code review found that the recent
> change referenced below had gone though and added a mask that prevented the
> ECN bits from being populated in the L3 header.
>
> This patch addresses that by rolling back the mask so that it is only
> applied to the flags coming from the incoming TCP request instead of
> applying it to the socket tos/tclass field. Doing this the ECT bits were
> restored in the SYN/ACK packets in my testing.
>
> One thing that is not addressed by this patch set is the fact that
> tcp_reflect_tos appears to be incompatible with ECN based congestion
> avoidance algorithms. At a minimum the feature should likely be documented
> which it currently isn't.
>
> Fixes: ac8f1710c12b ("tcp: reflect tos value received in SYN to the socket")

Acked-by: Wei Wang <weiwan@...gle.com>

Thanks for catching this. I was under the wrong impression that the
ECT bits were marked in tos after the tcp layer. Upon a closer look,
it seems right now, it only gets marked in inet_sock(sk)->tos from
tcp_init_congestion_control() once.
I will submit a follow-up fix to make sure we include the lower 2 bits
in the reflection case as well.

> Signed-off-by: Alexander Duyck <alexanderduyck@...com>
> ---
>  net/ipv4/tcp_ipv4.c |    5 +++--
>  net/ipv6/tcp_ipv6.c |    6 +++---
>  2 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index c2d5132c523c..c5f8b686aa82 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -981,7 +981,8 @@ static int tcp_v4_send_synack(const struct sock *sk, struct dst_entry *dst,
>         skb = tcp_make_synack(sk, dst, req, foc, synack_type, syn_skb);
>
>         tos = sock_net(sk)->ipv4.sysctl_tcp_reflect_tos ?
> -                       tcp_rsk(req)->syn_tos : inet_sk(sk)->tos;
> +                       tcp_rsk(req)->syn_tos & ~INET_ECN_MASK :
> +                       inet_sk(sk)->tos;
>
>         if (skb) {
>                 __tcp_v4_send_check(skb, ireq->ir_loc_addr, ireq->ir_rmt_addr);
> @@ -990,7 +991,7 @@ static int tcp_v4_send_synack(const struct sock *sk, struct dst_entry *dst,
>                 err = ip_build_and_send_pkt(skb, sk, ireq->ir_loc_addr,
>                                             ireq->ir_rmt_addr,
>                                             rcu_dereference(ireq->ireq_opt),
> -                                           tos & ~INET_ECN_MASK);
> +                                           tos);
>                 rcu_read_unlock();
>                 err = net_xmit_eval(err);
>         }
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 8db59f4e5f13..3d49e8d0afee 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -530,12 +530,12 @@ static int tcp_v6_send_synack(const struct sock *sk, struct dst_entry *dst,
>                 rcu_read_lock();
>                 opt = ireq->ipv6_opt;
>                 tclass = sock_net(sk)->ipv4.sysctl_tcp_reflect_tos ?
> -                               tcp_rsk(req)->syn_tos : np->tclass;
> +                               tcp_rsk(req)->syn_tos & ~INET_ECN_MASK :
> +                               np->tclass;
>                 if (!opt)
>                         opt = rcu_dereference(np->opt);
>                 err = ip6_xmit(sk, skb, fl6, sk->sk_mark, opt,
> -                              tclass & ~INET_ECN_MASK,
> -                              sk->sk_priority);
> +                              tclass, sk->sk_priority);
>                 rcu_read_unlock();
>                 err = net_xmit_eval(err);
>         }
>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ