Message-ID: <CALx6S35BhiY7DVPetEDrLwgsR3sKTcm-bJ6og-6N329Znavkww@mail.gmail.com>
Date: Tue, 1 Nov 2016 09:17:04 -0700
From: Tom Herbert <tom@...bertland.com>
To: Thomas Graf <tgraf@...g.ch>
Cc: "David S. Miller" <davem@...emloft.net>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
roopa <roopa@...ulusnetworks.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next v2 0/5] bpf: BPF for lightweight tunnel encapsulation
On Mon, Oct 31, 2016 at 5:37 PM, Thomas Graf <tgraf@...g.ch> wrote:
> {Open question:
> Tom brought up the question of whether it is safe to modify the packet
> in arbitrary ways before dst_output(). This is the equivalent of a raw
> socket injecting illegal headers. This v2 currently assumes that
> dst_output() is ready to accept invalid header values. This needs to be
> verified and if not the case, then raw sockets or dst_output() handlers
> must be fixed as well. Another option is to mark lwtunnel_output() as
> read-only for now.}
>
The question might not be so much about illegal headers as about whether
the fields in the skbuff that describe the packet contents are kept
correct. We have protocol, header offsets, offsets for inner protocols,
encapsulation settings, checksum status, checksum offset, checksum
complete value, and VLAN information. I believe any or all of these
could become incorrect if we allow BPF to modify the packet
arbitrarily. This problem is different from raw sockets because LWT
operates in the middle of the stack: the skbuff has already been set
up with such things.
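
To make the concern concrete, here is a minimal user-space sketch. It is
only a model: struct pkt_meta, struct pkt and every function below are
hypothetical stand-ins, not the real struct sk_buff or any kernel API. It
shows how offsets recorded when the packet was set up stop pointing at the
right bytes once a header is prepended without updating them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for per-packet metadata.
 * These are NOT the real struct sk_buff fields. */
struct pkt_meta {
    uint16_t protocol;      /* L3 protocol recorded at setup time */
    size_t   network_off;   /* offset of the L3 header in buf */
    size_t   transport_off; /* offset of the L4 header in buf */
    size_t   csum_start;    /* where a checksum computation would begin */
};

struct pkt {
    uint8_t buf[256];
    size_t  len;
    struct pkt_meta meta;
};

/* Build a toy packet: 20 bytes of "L3 header" (0x45) followed by
 * 8 bytes of "L4 header" (0xAA), with metadata that matches. */
static void pkt_init(struct pkt *p)
{
    memset(p, 0, sizeof(*p));
    memset(p->buf, 0x45, 20);
    memset(p->buf + 20, 0xAA, 8);
    p->len = 28;
    p->meta.protocol = 0x0800;
    p->meta.network_off = 0;
    p->meta.transport_off = 20;
    p->meta.csum_start = 20;
}

/* Prepend a header WITHOUT touching the metadata. This models an
 * arbitrary rewrite that leaves the recorded offsets stale. */
static int push_header_raw(struct pkt *p, const uint8_t *hdr, size_t hdr_len)
{
    if (p->len + hdr_len > sizeof(p->buf))
        return -1;
    memmove(p->buf + hdr_len, p->buf, p->len);
    memcpy(p->buf, hdr, hdr_len);
    p->len += hdr_len;
    return 0;
}

/* Prepend a header and shift every recorded offset by the same amount,
 * so each offset still points at the bytes it pointed at before. */
static int push_header_fixup(struct pkt *p, const uint8_t *hdr, size_t hdr_len)
{
    if (push_header_raw(p, hdr, hdr_len) < 0)
        return -1;
    p->meta.network_off += hdr_len;
    p->meta.transport_off += hdr_len;
    p->meta.csum_start += hdr_len;
    return 0;
}

/* First byte found at the recorded transport offset. */
static uint8_t byte_at_transport(const struct pkt *p)
{
    return p->buf[p->meta.transport_off];
}
```

After push_header_raw() with an 8-byte header, byte_at_transport() lands
inside the old L3 header; after push_header_fixup() it still finds the L4
bytes. Shifting all offsets uniformly is a modelling choice for this sketch;
a real encapsulation path would also have to decide which offsets become
"inner" ones.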
> This series implements BPF program invocation from dst entries via the
> lightweight tunnels infrastructure. The BPF program can be attached to
> lwtunnel_input(), lwtunnel_output() or lwtunnel_xmit() and sees an L3
> skb as context. input is read-only, output can write, xmit can write,
> push headers, and redirect.
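
The per-hook permissions described above can be pictured as a small
capability table. This is an illustrative sketch only: the enum names,
flags and hook_allows() are hypothetical and are not taken from the patch
set; only the permissions themselves come from the description.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three attach points. */
enum lwt_hook { HOOK_INPUT, HOOK_OUTPUT, HOOK_XMIT, HOOK_MAX };

/* Hypothetical capability flags. */
enum lwt_cap {
    CAP_READ     = 1u << 0,
    CAP_WRITE    = 1u << 1,
    CAP_PUSH_HDR = 1u << 2,
    CAP_REDIRECT = 1u << 3,
};

/* input is read-only, output can write, xmit can write, push headers
 * and redirect, as stated in the cover letter. */
static const unsigned int hook_caps[HOOK_MAX] = {
    [HOOK_INPUT]  = CAP_READ,
    [HOOK_OUTPUT] = CAP_READ | CAP_WRITE,
    [HOOK_XMIT]   = CAP_READ | CAP_WRITE | CAP_PUSH_HDR | CAP_REDIRECT,
};

/* True if every capability bit in 'cap' is granted to 'hook'. */
static bool hook_allows(enum lwt_hook hook, unsigned int cap)
{
    assert((int)hook < (int)HOOK_MAX);
    return (hook_caps[hook] & cap) == cap;
}
```

For example, hook_allows(HOOK_INPUT, CAP_WRITE) is false, while
hook_allows(HOOK_XMIT, CAP_PUSH_HDR | CAP_REDIRECT) is true.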
>
> Motivation for this work:
> - Restricting outgoing routes beyond what the route tuple supports
> - Per route accounting beyond realms
> - Fast attachment of L2 headers where header does not require resolving
> L2 addresses
> - ILA like use cases where L3 addresses are resolved and then routed
> in an async manner
> - Fast encapsulation + redirect. For now limited to use cases where not
> setting inner and outer offset/protocol is OK.
>
Is checksum offload supported? By default, at least for Linux, we
offload the outer UDP checksum in VXLAN and the other UDP
encapsulations for performance.
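
For reference, the computation that checksum offload hands to the device
is the Internet checksum (RFC 1071) over the relevant bytes. Below is a
plain software version as a sketch; inet_checksum() is a name chosen for
this example, and this is not how the kernel implements or configures
offload. For UDP the sum also covers a pseudo-header, which is left out
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum: ones'-complement sum of 16-bit big-endian
 * words, folded to 16 bits and inverted. The 32-bit accumulator is
 * sufficient for packet-sized inputs (up to 64 KiB). */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)               /* odd trailing byte, padded with zero */
        sum += (uint32_t)data[0] << 8;

    while (sum >> 16)           /* fold carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A useful property for checking a packet: running the same function over
the data with its checksum appended yields 0. If a program rewrites covered
bytes after the stack has recorded where the checksum starts and where it
must be stored, the device will compute the sum over the wrong range.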
Tom
> A couple of samples on how to use it can be found in patch 04.
>
> v1 -> v2:
> - Added new BPF_LWT_REROUTE return code for program to indicate
> that new route lookup should be performed. Suggested by Tom.
> - New sample to illustrate rerouting
> - New patch 05: Recursion limit for lwtunnel_output for the case
> when user creates circular dst redirection. Also resolves the
> issue for ILA.
> - Fix to ensure headroom for potential future L2 header is still
> guaranteed
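
The recursion limit in patch 05 can be illustrated with a small sketch.
Only the limit of 5 comes from the patch title; the route table,
NO_REDIRECT and resolve_output() are hypothetical, and the sketch counts
redirects in a loop rather than reproducing how the patch tracks depth.

```c
#include <assert.h>

#define MAX_REDIRECT_DEPTH 5   /* limit named in the patch 05 title */
#define NO_REDIRECT (-1)       /* hypothetical marker: route is terminal */

/* next_hop[i] is the index of the route that route i redirects to,
 * or NO_REDIRECT. Returns the terminal route index, or -1 if the chain
 * needs more than MAX_REDIRECT_DEPTH redirects (for example a loop) or
 * leaves the table. */
static int resolve_output(const int *next_hop, int nroutes, int start)
{
    int cur = start;
    int depth = 0;

    while (cur >= 0 && cur < nroutes && next_hop[cur] != NO_REDIRECT) {
        if (++depth > MAX_REDIRECT_DEPTH)
            return -1;          /* give up: probably circular */
        cur = next_hop[cur];
    }
    if (cur < 0 || cur >= nroutes)
        return -1;              /* redirect pointed outside the table */
    return cur;
}
```

With a table such as {1, 0}, where two routes redirect to each other, the
function returns -1 after five redirects instead of looping forever.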
>
> Thomas Graf (5):
> route: Set orig_output when redirecting to lwt on locally generated
> traffic
> route: Set lwtstate for local traffic and cached input dsts
> bpf: BPF for lightweight tunnel encapsulation
> bpf: Add samples for LWT-BPF
> lwtunnel: Limit number of recursions on output to 5
>
> include/linux/filter.h | 2 +-
> include/uapi/linux/bpf.h | 37 +++-
> include/uapi/linux/lwtunnel.h | 21 ++
> kernel/bpf/verifier.c | 16 +-
> net/Kconfig | 1 +
> net/core/Makefile | 2 +-
> net/core/filter.c | 148 ++++++++++++-
> net/core/lwt_bpf.c | 504 ++++++++++++++++++++++++++++++++++++++++++
> net/core/lwtunnel.c | 15 +-
> net/ipv4/route.c | 37 +++-
> samples/bpf/bpf_helpers.h | 4 +
> samples/bpf/lwt_bpf.c | 235 ++++++++++++++++++++
> samples/bpf/test_lwt_bpf.sh | 370 +++++++++++++++++++++++++++++++
> 13 files changed, 1373 insertions(+), 19 deletions(-)
> create mode 100644 net/core/lwt_bpf.c
> create mode 100644 samples/bpf/lwt_bpf.c
> create mode 100755 samples/bpf/test_lwt_bpf.sh
>
> --
> 2.7.4
>