Message-ID: <875ytrga3p.fsf@toke.dk>
Date: Wed, 20 Oct 2021 14:21:46 +0200
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Florian Westphal <fw@...len.de>
Cc: Florian Westphal <fw@...len.de>,
Kumar Kartikeya Dwivedi <memxor@...il.com>,
Maxim Mikityanskiy <maximmi@...dia.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>,
Eric Dumazet <edumazet@...gle.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
David Ahern <dsahern@...nel.org>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Nathan Chancellor <nathan@...nel.org>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Brendan Jackman <jackmanb@...gle.com>,
Florent Revest <revest@...omium.org>,
Joe Stringer <joe@...ium.io>,
Lorenz Bauer <lmb@...udflare.com>,
Tariq Toukan <tariqt@...dia.com>, netdev@...r.kernel.org,
bpf@...r.kernel.org, clang-built-linux@...glegroups.com
Subject: Re: [PATCH bpf-next 07/10] bpf: Add helpers to query conntrack info

Florian Westphal <fw@...len.de> writes:

> Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>> Florian Westphal <fw@...len.de> writes:
>>
>> > Kumar Kartikeya Dwivedi <memxor@...il.com> wrote:
>> >> On Tue, Oct 19, 2021 at 08:16:52PM IST, Maxim Mikityanskiy wrote:
>> >> > The new helpers (bpf_ct_lookup_tcp and bpf_ct_lookup_udp) allow querying
>> >> > connection tracking information of TCP and UDP connections based on
>> >> > source and destination IP address and port. The helper returns a pointer
>> >> > to struct nf_conn (if the conntrack entry was found), which needs to be
>> >> > released with bpf_ct_release.
>> >> >
>> >> > Signed-off-by: Maxim Mikityanskiy <maximmi@...dia.com>
>> >> > Reviewed-by: Tariq Toukan <tariqt@...dia.com>
>> >>
>> >> The last discussion on this [0] suggested that stable BPF helpers for conntrack
>> >> were not desired, hence the recent series [1] to extend kfunc support to modules
>> >> and base the conntrack work on top of it, which I'm working on now (supporting
>> >> both CT lookup and insert).
>> >
>> > This will sabotage the netfilter pipeline and the way things work more and
>> > more 8-(
>>
>> Why?
>
> Lookups should be fine. Insertions are the problem.
>
> NAT hooks are expected to execute before the insertion into the
> conntrack table.
>
> If you insert before, NAT hooks won't execute, i.e.
> rules that use dnat/redirect/masquerade have no effect.

Well yes, if you insert the wrong state into the conntrack table, you're
going to get wrong behaviour. That's sorta expected, there are lots of
things XDP can do to disrupt the packet flow (like just dropping the
packets :)).

>> > If you want to use netfilter with ebpf, please have a look at the RFC
>> > I posted and let's work on adding a netfilter specific program type
>> > that can run ebpf programs directly from any of the existing netfilter
>> > hook points.
>>
>> Accelerating netfilter using BPF is a worthy goal in itself, but I also
>> think having the ability to lookup into conntrack from XDP is useful for
>> cases where someone wants to bypass the stack entirely (for accelerating
>> packet forwarding, say). I don't think these goals are in conflict
>> either, what makes you say they are?
>
> Lookup is fine, I don't see fundamental issues with XDP-based bypass,
> there are flowtables that also bypass classic forward path via the
> netfilter ingress hook (first packet needs to go via classic path to
> pass through all filter + nat rules and is offloaded to HW or SW via
> the 'flow add' statement in nftables).
>
> I don't think there is anything that stands in the way of replicating
> this via XDP.

What I want to be able to do is write an XDP program that does the following:

1. Parse the packet header and determine if it's a packet type we know
   how to handle. If not, just return XDP_PASS and let the stack deal
   with corner cases.

2. If we know how to handle the packet (say, it's TCP or UDP), do a
   lookup into conntrack to figure out if there's state for it and we
   need to do things like NAT.

3. If we need to NAT, rewrite the packet based on the information we got
   back from conntrack.

4. Update the conntrack state to be consistent with the packet, and then
   redirect it out the destination interface.

I.e., in the common case the packet doesn't go through the stack at all;
but we need to make conntrack aware that we processed the packet so the
entry doesn't expire (and any state related to the flow gets updated).
Ideally we should also be able to create new state for a flow we haven't
seen before.

This requires updating of state, but I see no reason why this shouldn't
be possible?

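
As a rough illustration of the four steps above, here is a small
user-space C model of the control flow. It is not an XDP program and it
does not use the bpf_ct_lookup_* helpers from the patch or any kernel
API: struct pkt, struct ct_entry, ct_lookup(), handle_packet() and the
V_PASS/V_REDIRECT verdicts are all hypothetical stand-ins made up for
this sketch. On a lookup miss it simply hands the packet to the stack
rather than creating new state.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts standing in for XDP_PASS / XDP_REDIRECT. */
enum verdict { V_PASS, V_REDIRECT };

enum { PROTO_TCP = 6, PROTO_UDP = 17 };

/* Hypothetical, already-parsed view of the packet header. */
struct pkt {
	uint8_t  proto;
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* Hypothetical conntrack entry; NOT struct nf_conn. */
struct ct_entry {
	int        in_use;
	struct pkt orig;        /* original-direction tuple */
	int        do_dnat;     /* non-zero if destination must be rewritten */
	uint32_t   nat_daddr;
	uint16_t   nat_dport;
	uint64_t   last_seen;   /* refreshed so the entry doesn't expire */
	int        out_ifindex; /* where to redirect the packet */
};

#define CT_SIZE 16
struct ct_table { struct ct_entry e[CT_SIZE]; };

/* Linear scan over the mock table; returns NULL when there is no state. */
struct ct_entry *ct_lookup(struct ct_table *t, const struct pkt *p)
{
	for (size_t i = 0; i < CT_SIZE; i++) {
		struct ct_entry *e = &t->e[i];

		if (e->in_use &&
		    e->orig.proto == p->proto &&
		    e->orig.saddr == p->saddr && e->orig.daddr == p->daddr &&
		    e->orig.sport == p->sport && e->orig.dport == p->dport)
			return e;
	}
	return NULL;
}

enum verdict handle_packet(struct ct_table *t, struct pkt *p,
			   uint64_t now, int *out_ifindex)
{
	struct ct_entry *e;

	/* 1. Only handle packet types we understand; pass the rest up. */
	if (p->proto != PROTO_TCP && p->proto != PROTO_UDP)
		return V_PASS;

	/* 2. Look for existing state; on a miss let the stack deal with it. */
	e = ct_lookup(t, p);
	if (!e)
		return V_PASS;

	/* 3. Rewrite the packet if the entry says NAT is needed. */
	if (e->do_dnat) {
		p->daddr = e->nat_daddr;
		p->dport = e->nat_dport;
	}

	/* 4. Update the state, then redirect out the destination interface. */
	e->last_seen = now;
	*out_ifindex = e->out_ifindex;
	return V_REDIRECT;
}
```

Feeding this a TCP packet whose tuple matches an entry with do_dnat set
should rewrite daddr/dport, refresh last_seen and return V_REDIRECT;
anything that is not TCP or UDP comes back as V_PASS untouched.
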
-Toke