netdev - Re: RFC: Fixing SK_REUSEPORT from sk_lookup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CADa=RyyuAOupK7LOydQiNi6tx2ELOgD+bdu+DHh3xF0dDxw_gw@mail.gmail.com>
Date:   Wed, 15 May 2019 10:16:28 -0700
From:   Joe Stringer <joe@...valent.com>
To:     Lorenz Bauer <lmb@...udflare.com>
Cc:     Networking <netdev@...r.kernel.org>, bpf@...r.kernel.org
Subject: Re: RFC: Fixing SK_REUSEPORT from sk_lookup_* helpers

On Wed, May 15, 2019 at 8:11 AM Lorenz Bauer <lmb@...udflare.com> wrote:
>
> In the BPF-based TPROXY session with Joe Stringer [1], I mentioned
> that the sk_lookup_* helpers currently return inconsistent results if
> SK_REUSEPORT programs are in play.
>
> SK_REUSEPORT programs are a hook point in inet_lookup. They get access
> to the full packet
> that triggered the look up. To support this, inet_lookup gained a new
> skb argument to provide such context. If skb is NULL, the SK_REUSEPORT
> program is skipped and instead the socket is selected by its hash.
>
> The first problem is that not all callers to inet_lookup from BPF have
> an skb, e.g. XDP. This means that a look up from XDP gives an
> incorrect result. For now that is not a huge problem. However, once we
> get sk_assign as proposed by Joe, we can end up circumventing
> SK_REUSEPORT.

To clarify a bit, the reason this is a problem is that a
straightforward implementation may just consider passing the skb
context into the sk_lookup_*() and through to the inet_lookup() so
that it would run the SK_REUSEPORT BPF program for socket selection on
the skb when the packet-path BPF program performs the socket lookup.
However, as this paragraph describes, the skb context is not always
available.

> At the conference, someone suggested using a similar approach to the
> work done on the flow dissector by Stanislav: create a dedicated
> context sk_reuseport which can either take an skb or a plain pointer.
> Patch up load_bytes to deal with both. Pass the context to
> inet_lookup.
>
> This is when we hit the second problem: using the skb or XDP context
> directly is incorrect, because it assumes that the relevant protocol
> headers are at the start of the buffer. In our use case, the correct
> headers are at an offset since we're inspecting encapsulated packets.
>
> The best solution I've come up with is to steal 17 bits from the flags
> argument to sk_lookup_*, 1 bit for BPF_F_HEADERS_AT_OFFSET, 16bit for
> the offset itself.

FYI there's also the upper 32 bits of the netns_id parameter, another
option would be to steal 16 bits from there.

> Thoughts?

Internally with skbs, we use `skb_pull()` to manage header offsets,
could we do something similar with `bpf_xdp_adjust_head()` prior to
the call to `bpf_sk_lookup_*()`?