Message-ID: <20190224044455.ksagj4g2st477imn@ast-mbp.dhcp.thefacebook.com>
Date:   Sat, 23 Feb 2019 20:44:56 -0800
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Martin Lau <kafai@...com>
Cc:     Eric Dumazet <eric.dumazet@...il.com>,
        Lawrence Brakmo <brakmo@...com>,
        netdev <netdev@...r.kernel.org>, Alexei Starovoitov <ast@...com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH v2 bpf-next 2/9] bpf: Add bpf helper bpf_tcp_enter_cwr

On Sun, Feb 24, 2019 at 03:08:48AM +0000, Martin Lau wrote:
> On Sat, Feb 23, 2019 at 05:32:14PM -0800, Eric Dumazet wrote:
> > 
> > 
> > On 02/22/2019 05:06 PM, brakmo wrote:
> > > From: Martin KaFai Lau <kafai@...com>
> > > 
> > > This patch adds a new bpf helper BPF_FUNC_tcp_enter_cwr
> > > "int bpf_tcp_enter_cwr(struct bpf_tcp_sock *tp)".
> > > It is added to BPF_PROG_TYPE_CGROUP_SKB which can be attached
> > > to the egress path where the bpf prog is called by
> > > ip_finish_output() or ip6_finish_output().  The verifier
> > > ensures that the parameter must be a tcp_sock.
> > > 
> > > This helper makes a tcp_sock enter CWR state.  It can be used
> > > by a bpf_prog to manage egress network bandwidth limit per
> > > cgroupv2.  A later patch will have a sample program to
> > > show how it can be used to limit bandwidth usage per cgroupv2.
> > > 
> > > To ensure it is only called from BPF_CGROUP_INET_EGRESS, the
> > > attr->expected_attach_type must be specified as BPF_CGROUP_INET_EGRESS
> > > during load time if the prog uses this new helper.
> > > The newly added prog->enforce_expected_attach_type bit will also be set
> > > > if this new helper is used.  This bit is for backward compatibility reasons
> > > because currently prog->expected_attach_type has been ignored in
> > > BPF_PROG_TYPE_CGROUP_SKB.  During attach time,
> > > prog->expected_attach_type is only enforced if the
> > > prog->enforce_expected_attach_type bit is set.
> > > i.e. prog->expected_attach_type is only enforced if this new helper
> > > is used by the prog.
> > > 
> > 
> > BTW, it seems to me that BPF_CGROUP_INET_EGRESS can be used while the socket lock is not held.
> Thanks for pointing it out.
> 
> I see. I just noticed the comment above ip6_xmit():
> /*
>  * xmit an sk_buff (used by TCP, SCTP and DCCP)
>  * Note : socket lock is not held for SYNACK packets, but might be modified
>  * by calls to skb_set_owner_w() and ipv6_local_error(),
>  * which are using proper atomic operations or spinlocks.
>  */
> Are there cases other than SYNACK?

I don't think it's a problem.
The helper does:
BPF_CALL_1(bpf_tcp_enter_cwr, struct tcp_sock *, tp)
+{
+	struct sock *sk = (struct sock *)tp;
+
+	if (sk->sk_state == TCP_ESTABLISHED) {
+		tcp_enter_cwr(sk);

I believe that by the time ip_finish_output() is called on an established
socket, it's safe to call tcp_enter_cwr().
I don't see how this is different from the normal __tcp_transmit_skb() path.

Eric, what issue do you see?
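
As an aside on the load/attach rule in the quoted patch description, here is
a minimal userspace sketch of how I read it. It is not the kernel code:
struct prog_model, model_load(), model_attach(), the ATTACH_* names and the
-EINVAL return value are all hypothetical and only model the described
behaviour (expected_attach_type is enforced at attach time only when the
enforce bit was set because the prog uses the new helper).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for BPF_CGROUP_INET_INGRESS / BPF_CGROUP_INET_EGRESS. */
enum attach_type { ATTACH_INGRESS, ATTACH_EGRESS };

/* Hypothetical model of the two prog fields the patch description mentions. */
struct prog_model {
	enum attach_type expected_attach_type;
	bool enforce_expected_attach_type;
};

/*
 * Load time: a prog that uses the new helper must declare egress as its
 * expected attach type, and gets the enforce bit set.
 * The -EINVAL error code is an assumption made for this sketch.
 */
int model_load(struct prog_model *p, enum attach_type expected,
	       bool uses_enter_cwr)
{
	if (uses_enter_cwr && expected != ATTACH_EGRESS)
		return -EINVAL;
	p->expected_attach_type = expected;
	p->enforce_expected_attach_type = uses_enter_cwr;
	return 0;
}

/*
 * Attach time: expected_attach_type is only checked when the enforce bit
 * is set, so progs that do not use the helper keep the old behaviour.
 */
int model_attach(const struct prog_model *p, enum attach_type target)
{
	if (p->enforce_expected_attach_type &&
	    p->expected_attach_type != target)
		return -EINVAL;
	return 0;
}

/* Walks through the three cases from the patch description. */
void run_examples(void)
{
	struct prog_model legacy, cwr;

	/* Prog without the helper: expected type is ignored at attach. */
	assert(model_load(&legacy, ATTACH_INGRESS, false) == 0);
	assert(model_attach(&legacy, ATTACH_EGRESS) == 0);

	/* Prog with the helper: must be loaded for egress ... */
	assert(model_load(&cwr, ATTACH_INGRESS, true) == -EINVAL);
	assert(model_load(&cwr, ATTACH_EGRESS, true) == 0);

	/* ... and can then only be attached to egress. */
	assert(model_attach(&cwr, ATTACH_EGRESS) == 0);
	assert(model_attach(&cwr, ATTACH_INGRESS) == -EINVAL);
}
```

Calling run_examples() from a main() exercises all three cases. The point of
the sketch is only that the enforce bit keeps existing CGROUP_SKB progs
working while restricting progs that call the new helper to the egress hook.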