[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <D55885DE-EDF5-456C-B4E4-B9D27C6C75BA@fb.com>
Date: Tue, 19 Feb 2019 22:31:37 +0000
From: Lawrence Brakmo <brakmo@...com>
To: Eric Dumazet <eric.dumazet@...il.com>,
netdev <netdev@...r.kernel.org>
CC: Martin Lau <kafai@...com>, Alexei Starovoitov <ast@...com>,
"Daniel Borkmann" <daniel@...earbox.net>
Subject: Re: [PATCH bpf-next 7/9] bpf: Sample NRM BPF program to limit egress
bw
On 2/19/19, 10:29 AM, "netdev-owner@...r.kernel.org on behalf of Eric Dumazet" <netdev-owner@...r.kernel.org on behalf of eric.dumazet@...il.com> wrote:
On 02/18/2019 09:38 PM, brakmo wrote:
> +
> +static __always_inline void get_nrm_pkt_info(struct bpf_sock *sk,
> + struct nrm_pkt_info *pkti)
> +{
> + if (sk->family == AF_INET6 || sk->family == AF_INET) {
> + pkti->is_ip = true;
> + pkti->is_tcp = (sk->protocol == IPPROTO_TCP);
> + if (pkti->is_tcp) {
> + struct bpf_tcp_sock *tp;
> +
> + tp = bpf_tcp_sock(sk);
> + if (tp)
> + pkti->ecn = tp->ecn_flags & TCP_ECN_OK;
> + else
> + pkti->ecn = 0;
> + } else {
> + pkti->ecn = 0;
> + }
> + } else {
> + pkti->is_ip = false;
> + pkti->is_tcp = false;
> + pkti->ecn = 0;
> + }
> +}
>
This looks very strange.
ECN capability is per packet, and does not need access to the original
TCP socket really.
We definitely can use ECN with UDP packets.
IMO this sample looks like a work in progress.
This sample NRM program focuses on TCP, I should have made that more explicit. I originally was checking the ECN bits on the packet, but someone pointed out that since I was focusing on TCP, it would be more efficient to just look at the TCP state (saving having to read the packet header). However, your point is correct in that I could be wrongly marking packets that I should not mark, such as pure ACKs. I will go back to the previous version that looks at the ECN bits in the header and not limit ECN marking to just TCP.
EDT model allows to implement full shaping (not only virtual one)
by twaking/advancing skb->tstamp
(see commit f11216b24219a bpf: add skb->tstamp r/w access from tc clsact and cg skb progs)
I have a version of an NRM BPF program that uses EDT, so I know what you mean. However, I decided to send it as a separate patch (including an ingress sample program).
Implementing shaping with the need of accessing TCP sockets seems a layering violation,
a lot of things can go wrong with this model.
For instance, you wont be able to offload this.
On the other hand, there are also advantages to having access to the TCP socket. For example, one has more tools to affect the TCP rate (like the proposed helper bpf_tcp_enter_cwr() which does not require dropping the packet). We will need more experience to fully understand the tradeoffs between the various implementations of rate limiting and/or shaping.
Thank you for your feedback.
Powered by blists - more mailing lists