Message-ID: <f038645a-cb8a-dc59-e57e-2544a259bab1@mojatatu.com>
Date: Fri, 18 Jun 2021 07:40:13 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Daniel Borkmann <daniel@...earbox.net>,
Kumar Kartikeya Dwivedi <memxor@...il.com>
Cc: Cong Wang <xiyou.wangcong@...il.com>, bpf <bpf@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>, Vlad Buslov <vladbu@...dia.com>,
Jiri Pirko <jiri@...nulli.us>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Joe Stringer <joe@...ium.io>,
Quentin Monnet <quentin@...valent.com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Toke Høiland-Jørgensen <toke@...hat.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
Marcelo Ricardo Leitner <mleitner@...hat.com>
Subject: Re: [PATCH RFC bpf-next 0/7] Add bpf_link based TC-BPF API
On 2021-06-16 12:00 p.m., Daniel Borkmann wrote:
> On 6/16/21 5:32 PM, Kumar Kartikeya Dwivedi wrote:
>> On Wed, Jun 16, 2021 at 08:10:55PM IST, Jamal Hadi Salim wrote:
>>> On 2021-06-15 7:07 p.m., Daniel Borkmann wrote:
>>>> On 6/13/21 11:10 PM, Jamal Hadi Salim wrote:
[..]
>>>
>>> In particular, here's a list from Kartikeya's implementation:
>>>
>>> 1) Direct action mode only
>
> (More below.)
>
>>> 2) Protocol ETH_P_ALL only
>
> The issue I see with this one is that it's not very valuable or useful from
> a BPF point of view. Meaning, this kind of check can and typically is
> implemented from BPF program anyway. For example, when you have direct
> packet access initially parsing the eth header anyway (and from there
> having logic for the various eth protos).
In that case, make it optional to specify the protocol and default it
to ETH_P_ALL. As far as I can see, this flexibility doesn't
complicate usability or add code complexity to the interfaces.
>
> That protocol option is maybe more useful when you have classic tc with
> cls+act style pipeline where you want a quick skip of classifiers to avoid
> reparsing the packet. Given you can do everything inside the BPF program
> already it adds more confusion than value for a simple libbpf [tc/BPF] API.
>
There's no point in repeating the protocol identification when the
code calling into ebpf can do it, or has already done it. If all
I am interested in is IPv4, then my ebpf parser can be simplified
if I am sure I can assume it is an IPv4 packet.
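
To make the point concrete, here is a rough user-space sketch in plain C
(not actual BPF code). The function names, the fixed offsets, and the
assumption of an untagged Ethernet frame are all hypothetical and only
for illustration. The first parser is what a program attached with
ETH_P_ALL has to do; the second is what remains if the filter was
attached with protocol ETH_P_IP and the caller already matched the
ethertype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN     14      /* Ethernet header length, assuming no VLAN tag */
#define ETH_P_IP     0x0800  /* IPv4 ethertype */
#define IP_MIN_HLEN  20      /* minimum IPv4 header length */
#define IPPROTO_OFF  9       /* offset of the protocol byte in the IPv4 header */

/* Hypothetical parser for a filter attached with ETH_P_ALL: it has to
 * identify the ethertype itself before touching the IPv4 header.
 * Returns the IPv4 protocol number, or -1 if this is not usable IPv4. */
static int l4_proto_checked(const uint8_t *pkt, size_t len)
{
    assert(pkt != NULL);
    if (len < ETH_HLEN + IP_MIN_HLEN)
        return -1;

    uint16_t ethertype = (uint16_t)((pkt[12] << 8) | pkt[13]);
    if (ethertype != ETH_P_IP)
        return -1;

    return pkt[ETH_HLEN + IPPROTO_OFF];
}

/* Hypothetical parser for a filter attached with ETH_P_IP: the caller
 * has already matched the ethertype, so only the bounds check remains. */
static int l4_proto_assume_ipv4(const uint8_t *pkt, size_t len)
{
    assert(pkt != NULL);
    if (len < ETH_HLEN + IP_MIN_HLEN)
        return -1;

    return pkt[ETH_HLEN + IPPROTO_OFF];
}
```

The saving in this toy case is one load and one branch; the gain grows
with every protocol the program would otherwise have to tell apart.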
[..]
>>> 1) We use non-DA mode, so i cant live without that (and frankly ebpf
>>> has challenges adding complex code blocks).
>
> Could you elaborate on that or provide code examples? Since introduction of
> the direct action mode I've never used anything else again, and we do have
> complex BPF code blocks that we need to handle as well. Would be good if you
> could provide more details on things you ran into, maybe they can be solved?
>
The main issue is code complexity in ebpf, and not so much instruction
count (which is complicated once you have bounded loops).
Earlier, I tried to post on the ebpf list but I got no response.
I have since moved on. I would like to engage you at some point, and
you are right, there may be some clever tricks to achieve the goals
we had. The challenge is in keeping up with the bag of tricks to make
the verifier happy.

Being able to run non-DA mode and, for example, attach an action such
as the policer (and others) has pragmatic uses. It would be quite
complex to implement the policer within an all-in-one-appliance
DA-mode ebpf program.

One approach is to add more helpers to invoke such code directly
from ebpf, but we have some restrictions; the deployment is RHEL8.3
based and we have to live with the kernel features supported there,
i.e., a kernel upgrade is a no-no. Given that all these TC features have
existed (and been stable) for 100 years, it makes a lot of sense to use them.

We are going to present some of the challenges we faced in a subset
of our work in an approach to replace iptables at netdev 0x15
(hopefully we get accepted).

cheers,
jamal