Date:   Wed, 16 Jun 2021 18:00:09 +0200
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Kumar Kartikeya Dwivedi <memxor@...il.com>,
        Jamal Hadi Salim <jhs@...atatu.com>
Cc:     Cong Wang <xiyou.wangcong@...il.com>, bpf <bpf@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>, Vlad Buslov <vladbu@...dia.com>,
        Jiri Pirko <jiri@...nulli.us>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, Joe Stringer <joe@...ium.io>,
        Quentin Monnet <quentin@...valent.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Toke Høiland-Jørgensen <toke@...hat.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Marcelo Ricardo Leitner <mleitner@...hat.com>
Subject: Re: [PATCH RFC bpf-next 0/7] Add bpf_link based TC-BPF API

On 6/16/21 5:32 PM, Kumar Kartikeya Dwivedi wrote:
> On Wed, Jun 16, 2021 at 08:10:55PM IST, Jamal Hadi Salim wrote:
>> On 2021-06-15 7:07 p.m., Daniel Borkmann wrote:
>>> On 6/13/21 11:10 PM, Jamal Hadi Salim wrote:
>>
>> [..]
>>
>>>> I look at it from the perspective that if i can run something with
>>>> existing tc loading mechanism then i should be able to do the same
>>>> with the new (libbpf) scheme.
>>>
>>> The intention is not to provide a full-blown tc library (that could be
>>> subject to a libtc or such), but rather to only have libbpf abstract the
>>> tc related API that is most /relevant/ for BPF program development and
>>> /efficient/ in terms of execution in fast-path while at the same time
>>> providing a good user experience from the API itself.
>>>
>>> That is, simple to use and straight forward to explain to folks with
>>> otherwise zero experience of tc. The current implementation does all
>>> that, and from experience with large BPF programs managed via cls_bpf
>>> that is all that is actually needed from tc layer perspective. The
>>> ability to have multi programs (incl. priorities) is in the existing
>>> libbpf API as well.
>>
>> Which is a fair statement, but if you take away things that work fine
>> with current iproute2 loading I have no motivation to migrate at all.
>> Its like that saying of "throwing out the baby with the bathwater".
>> I want my baby.
>>
>> In particular, here's a list from Kartikeya's implementation:
>>
>> 1) Direct action mode only

(More below.)

>> 2) Protocol ETH_P_ALL only

The issue I see with this one is that it's not very valuable from a BPF point
of view: this kind of check can be, and typically is, implemented in the BPF
program anyway. For example, with direct packet access you initially parse the
eth header anyway, and from there have logic for the various eth protos.

That protocol option is maybe more useful when you have a classic tc cls+act
style pipeline where you want a quick skip of classifiers to avoid reparsing the
packet. Given you can do everything inside the BPF program already, it adds more
confusion than value for a simple libbpf [tc/BPF] API.
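
To illustrate the in-program protocol check, here is a minimal sketch written
as plain user-space C so it can be compiled and tested without a BPF toolchain.
It mirrors the usual direct packet access pattern (bounds check against
data_end, then dispatch on the EtherType). All sketch_/SKETCH_ names are
hypothetical and exist only for this example; a real tc BPF program would take
a struct __sk_buff context and return TC_ACT_* codes instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for this sketch; the values are the standard EtherTypes. */
#define SKETCH_ETH_HLEN    14
#define SKETCH_ETH_P_IP    0x0800
#define SKETCH_ETH_P_ARP   0x0806
#define SKETCH_ETH_P_IPV6  0x86DD

enum sketch_verdict {
    SKETCH_PASS_UNPARSED = 0, /* frame too short or protocol not handled */
    SKETCH_HANDLE_IPV4,
    SKETCH_HANDLE_IPV6,
    SKETCH_HANDLE_ARP,
};

/* Classify a frame by EtherType, the way a BPF program would after
 * loading data/data_end from its context. */
static enum sketch_verdict sketch_classify(const uint8_t *data,
                                           const uint8_t *data_end)
{
    uint16_t proto;

    /* Bounds check first: never read past data_end. */
    if (data_end < data || (size_t)(data_end - data) < SKETCH_ETH_HLEN)
        return SKETCH_PASS_UNPARSED;

    /* The EtherType sits at bytes 12..13 in network byte order. */
    proto = (uint16_t)((data[12] << 8) | data[13]);

    switch (proto) {
    case SKETCH_ETH_P_IP:
        return SKETCH_HANDLE_IPV4;
    case SKETCH_ETH_P_IPV6:
        return SKETCH_HANDLE_IPV6;
    case SKETCH_ETH_P_ARP:
        return SKETCH_HANDLE_ARP;
    default:
        return SKETCH_PASS_UNPARSED;
    }
}
```

Since the program has to parse the eth header before it can do anything else
with the packet, the per-protocol dispatch falls out of that same parse rather
than needing a separate filter option at attach time.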

>> 3) Only at chain 0
>> 4) No block support
> 
> Block is supported, you just need to set TCM_IFINDEX_MAGIC_BLOCK as ifindex and
> parent as the block index. There isn't anything more to it than that from libbpf
> side (just specify BPF_TC_CUSTOM enum).
> 
> What I meant was that hook_create doesn't support specifying the ingress/egress
> block when creating clsact, but that typically isn't a problem because qdiscs
> for shared blocks would be set up together prior to the attachment anyway.
> 
>> I think he said priority is supported but was also originally on that
>> list.
>> When we discussed at the meetup it didnt seem these cost anything
>> in terms of code complexity or usability of the API.
>>
>> 1) We use non-DA mode, so i cant live without that (and frankly ebpf
>> has challenges adding complex code blocks).

Could you elaborate on that or provide code examples? Since the introduction of
direct action mode I've never used anything else again, and we do have complex
BPF code blocks that we need to handle as well. It would be good if you could
provide more details on the things you ran into; maybe they can be solved?

>> 2) We also use different protocols when i need to
>> (yes, you can do the filtering in the bpf code - but why impose that
>> if the cost of adding it is simple? and of course it is cheaper to do
>> the check outside of ebpf)
>> 3) We use chains outside of zero
>>
>> 4) So far we dont use block support but certainly my recent experiences
>> in a deployment shows that we need to group netdevices more often than
>> i thought was necessary. So if i could express one map shared by
>> multiple netdevices it should cut down the user space complexity.

Thanks,
Daniel
