Message-ID: <CAM_iQpUS_71R7wujqhUnF41dtVtNj=5kXcdAHea1euhESbeJrg@mail.gmail.com>
Date: Fri, 11 Dec 2020 11:16:03 -0800
From: Cong Wang <xiyou.wangcong@...il.com>
To: Maxim Mikityanskiy <maximmi@...lanox.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Jamal Hadi Salim <jhs@...atatu.com>,
Jiri Pirko <jiri@...nulli.us>,
Saeed Mahameed <saeedm@...dia.com>,
Jakub Kicinski <kuba@...nel.org>,
Tariq Toukan <tariqt@...lanox.com>,
Maxim Mikityanskiy <maximmi@...dia.com>,
Dan Carpenter <dan.carpenter@...cle.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
Tariq Toukan <tariqt@...dia.com>
Subject: Re: [PATCH net-next v2 2/4] sch_htb: Hierarchical QoS hardware offload
On Fri, Dec 11, 2020 at 7:26 AM Maxim Mikityanskiy <maximmi@...lanox.com> wrote:
>
> HTB doesn't scale well because of contention on a single lock, and it
> also consumes CPU. This patch adds support for offloading HTB to
> hardware that supports hierarchical rate limiting.
>
> This solution addresses two main problems of scaling HTB:
>
> 1. Contention by flow classification. Currently the filters are attached
> to the HTB instance as follows:
I do not think this is the reason: tcf_classify() has been called with RCU
only on the ingress side for a rather long time. What contention are you
talking about here?
>
> # tc filter add dev eth0 parent 1:0 protocol ip flower dst_port 80
> classid 1:10
>
> It's possible to move classification to clsact egress hook, which is
> thread-safe and lock-free:
>
> # tc filter add dev eth0 egress protocol ip flower dst_port 80
> action skbedit priority 1:10
>
> This way classification still happens in software, but the lock
> contention is eliminated, and it happens before selecting the TX queue,
> allowing the driver to translate the class to the corresponding hardware
> queue.
Sure, you can use clsact with HTB, or any combination you like, but you
can't assume your HTB only works with clsact, can you?
>
> Note that this is already compatible with non-offloaded HTB and doesn't
> require changes to the kernel nor iproute2.
>
> 2. Contention by handling packets. HTB is not multi-queue, it attaches
> to a whole net device, and handling of all packets takes the same lock.
> When HTB is offloaded, its algorithm is done in hardware. HTB registers
> itself as a multi-queue qdisc, similarly to mq: HTB is attached to the
> netdev, and each queue has its own qdisc. The control flow is still done
> by HTB: it calls the driver via ndo_setup_tc to replicate the hierarchy
> of classes in the NIC. Leaf classes are presented by hardware queues.
> The data path works as follows: a packet is classified by clsact, the
> driver selects a hardware queue according to its class, and the packet
> is enqueued into this queue's qdisc.
I have _not_ read your code, but from what you describe here, it sounds like
you just want a per-queue rate limit instead of a global one. So why
bother with HTB, whose goal is a global rate limit?
And doesn't TBF already work with mq? I mean you can attach it as
a leaf to each mq so that the tree lock will not be shared either, but you'd
lose the benefits of a global rate limit too. EDT does basically the same,
but it never claims to completely replace HTB. ;)
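To illustrate the mq + TBF layout mentioned above, here is a rough dry-run sketch that only prints the tc commands rather than running them. The device name (eth0), the queue count (4), and the rate/burst/latency values are placeholder assumptions, not taken from the patch, and the helper name print_mq_tbf is hypothetical:

```shell
#!/bin/sh
# Dry-run sketch: print the tc commands that would attach one TBF qdisc
# as a leaf under each class of a root mq qdisc.
# All values (device, queue count, rate, burst, latency) are illustrative
# assumptions; nothing here is executed against a real device.
print_mq_tbf() {
    dev=$1
    nqueues=$2
    rate=$3
    # Root mq qdisc: exposes one class per TX queue.
    echo "tc qdisc replace dev $dev root handle 1: mq"
    i=1
    while [ "$i" -le "$nqueues" ]; do
        # Class minor IDs are hexadecimal in tc, hence %x.
        printf 'tc qdisc replace dev %s parent 1:%x tbf rate %s burst 32kb latency 50ms\n' \
            "$dev" "$i" "$rate"
        i=$((i + 1))
    done
}

print_mq_tbf eth0 4 1gbit
```

Each TBF instance here limits only its own queue, so there is no shared tree lock, but there is also no global rate limit across queues.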
Thanks.