Message-ID: <20210824234700.qlteie6al3cldcu5@kafai-mbp>
Date: Tue, 24 Aug 2021 16:47:00 -0700
From: Martin KaFai Lau <kafai@...com>
To: Cong Wang <xiyou.wangcong@...il.com>
CC: <netdev@...r.kernel.org>, <bpf@...r.kernel.org>, <toke@...hat.com>,
Cong Wang <cong.wang@...edance.com>,
Jamal Hadi Salim <jhs@...atatu.com>,
Jiri Pirko <jiri@...nulli.us>
Subject: Re: [RFC Patch net-next] net_sched: introduce eBPF based Qdisc

On Fri, Aug 20, 2021 at 06:02:40PM -0700, Cong Wang wrote:
> From: Cong Wang <cong.wang@...edance.com>
>
> This *incomplete* patch introduces a programmable Qdisc with
> eBPF. The goal is to make Qdisc as programmable as possible,
> that is, to replace as many existing Qdisc's as we can. ;)
>
> The design was discussed during last LPC:
> https://linuxplumbersconf.org/event/7/contributions/679/attachments/520/1188/sch_bpf.pdf
>
> Here is a summary of design decisions I made:
>
> 1. Avoid eBPF struct_ops, as it would be really hard to program
> a Qdisc with this approach.

Please explain this in more detail. What is currently missing
to make it possible to implement a Qdisc with struct_ops?

> 2. Avoid exposing skb's to user-space, which means we can't introduce
> a map to store skb's. Instead, store them in kernel without exposure
> to user-space.
>
> So I choose to use priority queues to store skb's inside a
> flow and to store flows inside a Qdisc, and let eBPF programs
> decide the *relative* position of the skb within the flow and the
> *relative* order of the flows too, upon each enqueue and dequeue.
> Each flow is also exposed to user as a TC class, like many other
> classful Qdisc's.
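
To make the two-level ordering described above concrete, here is a minimal
user-space sketch in C. It is not taken from the patch: the struct names, the
fixed array sizes, and the helpers flow_enqueue() and qdisc_dequeue() are all
hypothetical, and the ranks that the eBPF programs would compute are simply
passed in as integers. It assumes that a lower rank means "dequeue earlier".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FLOWS 8     /* hypothetical limits, chosen only for this sketch */
#define MAX_PKTS  16

struct pkt {                    /* stand-in for an skb */
    int id;
    unsigned long rank;         /* position of the packet within its flow */
};

struct flow {                   /* would be exposed as a TC class */
    unsigned long rank;         /* position of the flow within the Qdisc */
    size_t len;
    struct pkt q[MAX_PKTS];     /* kept sorted by ascending rank */
};

struct qdisc {
    size_t nflows;
    struct flow flows[MAX_FLOWS];
};

/* Insert a packet at the position given by its rank. In the RFC the rank
 * would come from the eBPF enqueue program; here the caller supplies it.
 * Packets with equal rank keep their arrival order. */
static int flow_enqueue(struct flow *f, int id, unsigned long rank)
{
    size_t i;

    assert(f != NULL);
    if (f->len == MAX_PKTS)
        return -1;              /* flow is full: drop */
    for (i = f->len; i > 0 && f->q[i - 1].rank > rank; i--)
        f->q[i] = f->q[i - 1];
    f->q[i].id = id;
    f->q[i].rank = rank;
    f->len++;
    return 0;
}

/* Pick the non-empty flow with the lowest rank, then pop that flow's
 * lowest-rank packet. Returns the packet id, or -1 if nothing is queued. */
static int qdisc_dequeue(struct qdisc *q)
{
    struct flow *best = NULL;
    size_t i;
    int id;

    for (i = 0; i < q->nflows; i++) {
        struct flow *f = &q->flows[i];

        if (f->len && (!best || f->rank < best->rank))
            best = f;
    }
    if (!best)
        return -1;
    id = best->q[0].id;
    for (i = 1; i < best->len; i++)
        best->q[i - 1] = best->q[i];
    best->len--;
    return id;
}
```

Sorted arrays are used only to keep the sketch short; a kernel implementation
would more likely use a tree or heap for both levels.
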
>
> Although the biggest limitation is obviously that users cannot
> traverse the packets or flows inside the Qdisc, I think they
> could at least store any global information of interest inside
> their own map, and the map can be shared between enqueue and
> dequeue. For example, users could use the skb pointer as the key
> and the rank as the value to find out the absolute order.
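
The "skb pointer as key, rank as value" idea can be illustrated with another
small user-space sketch in C. Everything here is an assumption made for
illustration: rank_map is a plain linear table standing in for a BPF map
shared by the enqueue and dequeue programs, and map_update(), map_delete() and
map_position() are hypothetical helpers, not BPF helper functions. The
absolute position of a packet is taken to be the number of tracked packets
with a strictly smaller rank, so packets with equal rank share a position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_SIZE 16             /* hypothetical capacity */

struct entry {
    uintptr_t key;              /* stand-in for the skb pointer */
    unsigned long rank;
    int used;
};

struct rank_map {
    struct entry e[MAP_SIZE];
};

/* Called on enqueue: remember (or update) the rank chosen for this packet. */
static int map_update(struct rank_map *m, uintptr_t key, unsigned long rank)
{
    size_t i, free_slot = MAP_SIZE;

    for (i = 0; i < MAP_SIZE; i++) {
        if (m->e[i].used && m->e[i].key == key) {
            m->e[i].rank = rank;
            return 0;
        }
        if (!m->e[i].used && free_slot == MAP_SIZE)
            free_slot = i;
    }
    if (free_slot == MAP_SIZE)
        return -1;              /* table is full */
    m->e[free_slot].key = key;
    m->e[free_slot].rank = rank;
    m->e[free_slot].used = 1;
    return 0;
}

/* Called on dequeue: forget the packet. */
static void map_delete(struct rank_map *m, uintptr_t key)
{
    size_t i;

    for (i = 0; i < MAP_SIZE; i++)
        if (m->e[i].used && m->e[i].key == key)
            m->e[i].used = 0;
}

/* Absolute position = number of tracked packets with a strictly smaller
 * rank. Returns -1 if the key is not tracked. */
static long map_position(const struct rank_map *m, uintptr_t key)
{
    const struct entry *self = NULL;
    long pos = 0;
    size_t i;

    for (i = 0; i < MAP_SIZE; i++)
        if (m->e[i].used && m->e[i].key == key)
            self = &m->e[i];
    if (!self)
        return -1;
    for (i = 0; i < MAP_SIZE; i++)
        if (m->e[i].used && m->e[i].rank < self->rank)
            pos++;
    return pos;
}
```
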
>
> One of the challenges is how to interact with the existing TC infra.
> For instance, if users install TC filters on this Qdisc, should
> we respect them by ignoring or rejecting the attached eBPF enqueue
> program, or vice versa? Should we allow users to replace each
> priority queue of a class with a regular Qdisc?
>
> Any high-level feedback is welcome. Please do not review any
> coding details until the RFC tag is removed.
>
> Cc: Jamal Hadi Salim <jhs@...atatu.com>
> Cc: Jiri Pirko <jiri@...nulli.us>
> Signed-off-by: Cong Wang <cong.wang@...edance.com>