lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM0EoMm45HX=zd1qMThugYRGA9bugM-OT9NPx++VWj_zYowDmQ@mail.gmail.com>
Date: Wed, 24 Jan 2024 09:11:39 -0500
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: Stanislav Fomichev <sdf@...gle.com>, Amery Hung <ameryhung@...il.com>, netdev@...r.kernel.org, 
	bpf@...r.kernel.org, yangpeihao@...u.edu.cn, toke@...hat.com, 
	jiri@...nulli.us, xiyou.wangcong@...il.com, yepeilin.cs@...il.com
Subject: Re: [RFC PATCH v7 0/8] net_sched: Introduce eBPF based Qdisc

On Wed, Jan 24, 2024 at 8:08 AM Daniel Borkmann <daniel@...earbox.net> wrote:
>
> On 1/24/24 1:09 PM, Jamal Hadi Salim wrote:
> > On Tue, Jan 23, 2024 at 4:13 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> >> On 01/17, Amery Hung wrote:
> >>> Hi,
> >>>
> >>> I am continuing the work of ebpf-based Qdisc based on Cong’s previous
> >>> RFC. The followings are some use cases of eBPF Qdisc:
> >>>
> >>> 1. Allow customizing Qdiscs in an easier way. So that people don't
> >>>     have to write a complete Qdisc kernel module just to experiment
> >>>     some new queuing theory.
> >>>
> >>> 2. Solve EDT's problem. EDT calcuates the "tokens" in clsact which
> >>>     is before enqueue, it is impossible to adjust those "tokens" after
> >>>     packets get dropped in enqueue. With eBPF Qdisc, it is easy to
> >>>     be solved with a shared map between clsact and sch_bpf.
> >>>
> >>> 3. Replace qevents, as now the user gains much more control over the
> >>>     skb and queues.
> >>>
> >>> 4. Provide a new way to reuse TC filters. Currently TC relies on filter
> >>>     chain and block to reuse the TC filters, but they are too complicated
> >>>     to understand. With eBPF helper bpf_skb_tc_classify(), we can invoke
> >>>     TC filters on _any_ Qdisc (even on a different netdev) to do the
> >>>     classification.
> >>>
> >>> 5. Potentially pave a way for ingress to queue packets, although
> >>>     current implementation is still only for egress.
> >>>
> >>> I’ve combed through previous comments and appreciated the feedbacks.
> >>> Some major changes in this RFC is the use of kptr to skb to maintain
> >>> the validility of skb during its lifetime in the Qdisc, dropping rbtree
> >>> maps, and the inclusion of two examples.
> >>>
> >>> Some questions for discussion:
> >>>
> >>> 1. We now pass a trusted kptr of sk_buff to the program instead of
> >>>     __sk_buff. This makes most helpers using __sk_buff incompatible
> >>>     with eBPF qdisc. An alternative is to still use __sk_buff in the
> >>>     context and use bpf_cast_to_kern_ctx() to acquire the kptr. However,
> >>>     this can only be applied to enqueue program, since in dequeue program
> >>>     skbs do not come from ctx but kptrs exchanged out of maps (i.e., there
> >>>     is no __sk_buff). Any suggestion for making skb kptr and helper
> >>>     functions compatible?
> >>>
> >>> 2. The current patchset uses netlink. Do we also want to use bpf_link
> >>>     for attachment?
> >>
> >> [..]
> >>
> >>> 3. People have suggested struct_ops. We chose not to use struct_ops since
> >>>     users might want to create multiple bpf qdiscs with different
> >>>     implementations. Current struct_ops attachment model does not seem
> >>>     to support replacing only functions of a specific instance of a module,
> >>>     but I might be wrong.
> >>
> >> I still feel like it deserves at leasta try. Maybe we can find some potential
> >> path where struct_ops can allow different implementations (Martin probably
> >> has some ideas about that). I looked at the bpf qdisc itself and it doesn't
> >> really have anything complicated (besides trying to play nicely with other
> >> tc classes/actions, but I'm not sure how relevant that is).
> >
> > Are you suggesting that it is a nuisance to integrate with the
> > existing infra? I would consider it being a lot more than "trying to
> > play nicely". Besides, it's a kfunc and people will not be forced to
> > use it.
>
> What's the use case?

What's the use case for enabling existing infra to work? Sure, let's
rewrite everything from scratch in ebpf. And then introduce new
tooling which well funded companies are capable of owning the right
resources to build and manage. Open source is about choices and
sharing and this is about choices and sharing.

> If you already go that route to implement your own
> qdisc, why is there a need to take the performane hit and go all the
> way into old style cls/act infra when it can be done in a more straight
> forward way natively?

Who is forcing you to use the kfunc? This is about choice.
What is ebpf these days anyways? Is it a) a programming environment or
b) is it the only way to do things? I see it as available infra i.e #a
not as the answer looking for a question.  IOW, as something we can
use to build the infra we need and use kfuncs when it makes sense. Not
everybody has infinite resources to keep hacking things into ebpf.

> For the vast majority of cases this will be some
> very lightweight classification anyway (if not outsourced to tc egress
> given the lock). If there is a concrete production need, it could be
> added, otherwise if there is no immediate use case which cannot be solved
> otherwise I would not add unnecessary kfuncs.

"Unnecessary" is really your view.

cheers,
jamal

> Cheers,
> Daniel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ