Message-Id: <20260108-mq-cake-sub-qdisc-v7-0-4eb645f0419c@redhat.com>
Date: Thu, 08 Jan 2026 17:56:02 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Toke Høiland-Jørgensen <toke@...e.dk>,
Jamal Hadi Salim <jhs@...atatu.com>, Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>
Cc: Jonas Köppeler <j.koeppeler@...berlin.de>,
cake@...ts.bufferbloat.net, netdev@...r.kernel.org,
Willem de Bruijn <willemb@...gle.com>,
Toke Høiland-Jørgensen <toke@...hat.com>,
Victor Nogueira <victor@...atatu.com>
Subject: [PATCH net-next v7 0/6] Multi-queue aware sch_cake
This series adds a multi-queue aware variant of the sch_cake scheduler,
called 'cake_mq'. Using this makes it possible to scale the rate shaper
of sch_cake across multiple CPUs, while still enforcing a single global
rate on the interface.
The approach taken in this patch series is to implement a separate qdisc
called 'cake_mq', which is based on the existing 'mq' qdisc, but differs
in a couple of aspects:
- It will always install a cake instance on each hardware queue (instead
of using the default qdisc for each queue like 'mq' does).
- The cake instances on the queues will share their configuration, which
can only be modified through the parent cake_mq instance.
Doing things this way simplifies user configuration by centralising
all configuration through the cake_mq qdisc (which also serves as an
obvious way of opting into the multi-queue aware behaviour). The cake_mq
qdisc takes all the same configuration parameters as the cake qdisc.
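The configuration sharing described above can be pictured with a small userspace model. This is only a sketch and not the code from the patches: the names shared_config, child_instance, parent_qdisc, parent_init() and parent_change() are hypothetical, as is the fixed queue count, and the locking and RCU handling that the kernel code needs are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_QUEUES 4 /* hypothetical number of hardware queues */

/* Hypothetical stand-in for the factored-out configuration struct. */
struct shared_config {
	uint64_t rate_bps; /* the single global shaper rate */
};

/* Hypothetical per-queue instance: it only references the config. */
struct child_instance {
	const struct shared_config *config;
};

/* Hypothetical parent: owns the config and one child per queue. */
struct parent_qdisc {
	struct shared_config config;
	struct child_instance children[NUM_QUEUES];
};

/* Install one child per queue, all pointing at the parent's config. */
static void parent_init(struct parent_qdisc *p, uint64_t rate_bps)
{
	assert(p != NULL);
	p->config.rate_bps = rate_bps;
	for (size_t i = 0; i < NUM_QUEUES; i++)
		p->children[i].config = &p->config;
}

/* Configuration changes go through the parent only; because the
 * children hold a read-only pointer, they all see the new value. */
static void parent_change(struct parent_qdisc *p, uint64_t rate_bps)
{
	assert(p != NULL);
	p->config.rate_bps = rate_bps;
}
```

In this model, calling parent_change() once updates the rate seen through every children[i].config, which mirrors the idea that the per-queue instances cannot be configured individually.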
An earlier version of this work was presented at Netdevconf 0x19:
https://netdevconf.info/0x19/sessions/talk/mq-cake-scaling-software-rate-limiting-across-cpu-cores.html
The patch series is structured as follows:
- Patch 1 exports the mq qdisc functions for reuse.
- Patch 2 factors out the sch_cake configuration variables into a
separate struct that can be shared between instances.
- Patch 3 adds the basic cake_mq qdisc, reusing the exported mq code.
- Patch 4 adds configuration sharing across the cake instances installed
under cake_mq.
- Patch 5 adds the shared shaper state that enables the multi-core rate
shaping.
- Patch 6 adds selftests for cake_mq.
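As a rough illustration of how a single global rate could be kept while shaping on several queues (the goal of patch 5), the sketch below has each instance count the active queues and shape to an equal share of the global rate. It is a simplified userspace model under stated assumptions, not the implementation in the series: the struct layout, count_active() and local_rate() are hypothetical, and using the 200 us sync interval from the changelog as a "recently active" window is a simplification made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sync interval mentioned in the changelog; using it as the
 * "recently active" window is an assumption of this sketch. */
#define SYNC_TIME_US 200

/* Hypothetical per-queue state; not the kernel's struct layout. */
struct queue_state {
	uint64_t backlog_bytes;  /* bytes currently queued */
	uint64_t last_active_us; /* timestamp of last activity */
};

/* Count queues that hold packets or were active within the window. */
static size_t count_active(const struct queue_state *qs, size_t n,
			   uint64_t now_us)
{
	size_t active = 0;

	assert(qs != NULL);
	for (size_t i = 0; i < n; i++) {
		bool recent = now_us >= qs[i].last_active_us &&
			      now_us - qs[i].last_active_us <= SYNC_TIME_US;

		if (qs[i].backlog_bytes > 0 || recent)
			active++;
	}
	return active;
}

/* Each instance shapes to an equal share of the global rate, so the
 * shares of all active queues sum to at most the global rate. */
static uint64_t local_rate(uint64_t global_rate_bps, size_t num_active)
{
	if (num_active == 0)
		num_active = 1; /* a lone sender may use the full rate */
	return global_rate_bps / num_active;
}
```

With two active queues out of four and a 1 Gbit/s global rate, each active instance would shape to 500 Mbit/s in this model; a real implementation also has to handle concurrent updates to the shared state, which the sketch ignores.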
A patch to iproute2 to make it aware of the cake_mq qdisc was submitted
separately with a previous version of this series:
https://lore.kernel.org/r/20260105162902.1432940-1-toke@redhat.com
---
Changes in v7:
- Use kzalloc() instead of kvcalloc(1, ...)
- Add missing SoB to last patch
- Add new include/net/sch_priv.h header file to MAINTAINERS entry for TC
- Link to v6: https://lore.kernel.org/r/20260106-mq-cake-sub-qdisc-v6-0-ee2e06b1eb1a@redhat.com
Changes in v6:
- Add missing teardown command in last selftest
- Link to v5: https://lore.kernel.org/r/20260105-mq-cake-sub-qdisc-v5-0-8a99b9db05e6@redhat.com
Changes in v5:
- Disallow using autorate-ingress with cake_mq
- Lock each child in cake_mq_change() instead of the parent
- Move mq exports into its own header file and export them with EXPORT_SYMBOL_NS_GPL
- Add selftests
- Link to v4: https://lore.kernel.org/r/20251201-mq-cake-sub-qdisc-v4-0-50dd3211a1c6@redhat.com
Changes in v4:
- A bunch of bot nits:
  - Fix null pointer deref in cake_destroy()
  - Unwind qdisc registration on failure
  - Use rcu_dereference() instead of rtnl_dereference() in data path
  - Use WRITE_ONCE() for q->last_active
  - Store num_active_qs to stats value after computing it
- Link to v3: https://lore.kernel.org/r/20251130-mq-cake-sub-qdisc-v3-0-5f66c548ecdc@redhat.com
Changes in v3:
- Export the functions from sch_mq and reuse them instead of copy-pasting
- Dropped Jamal's reviewed-by on the patches that changed due to the above
- Fixed a crash if cake_mq_init is called with a NULL opt parameter
- Link to v2: https://lore.kernel.org/r/20251127-mq-cake-sub-qdisc-v2-0-24d9ead047b9@redhat.com
Changes in v2:
- Rebase on top of net-next, incorporating Eric's changes
- Link to v1: https://lore.kernel.org/r/20251124-mq-cake-sub-qdisc-v1-0-a2ff1dab488f@redhat.com
Changes in v1 (since RFC):
- Drop the sync_time parameter for now and always use the 200 us value.
We are planning to explore auto-configuration of the sync time, so
this is to avoid committing to a UAPI. If needed, a parameter can be
added back later.
- Keep the tc yaml spec in sync with the new stats member
- Rebase on net-next
- Link to RFC: https://lore.kernel.org/r/20250924-mq-cake-sub-qdisc-v1-0-43a060d1112a@redhat.com
---
Jonas Köppeler (2):
      net/sched: sch_cake: share shaper state across sub-instances of cake_mq
      selftests/tc-testing: add selftests for cake_mq qdisc

Toke Høiland-Jørgensen (4):
      net/sched: Export mq functions for reuse
      net/sched: sch_cake: Factor out config variables into separate struct
      net/sched: sch_cake: Add cake_mq qdisc for using cake on mq devices
      net/sched: sch_cake: Share config across cake_mq sub-qdiscs

 Documentation/netlink/specs/tc.yaml                |   3 +
 MAINTAINERS                                        |   1 +
 include/net/sch_priv.h                             |  27 +
 include/uapi/linux/pkt_sched.h                     |   1 +
 net/sched/sch_cake.c                               | 514 ++++++++++++++-----
 net/sched/sch_mq.c                                 |  71 ++-
 .../tc-testing/tc-tests/qdiscs/cake_mq.json        | 559 +++++++++++++++++++++
 7 files changed, 1018 insertions(+), 158 deletions(-)
---
base-commit: 76de4e1594b7dfacba549e9db60585811f45dbe5
change-id: 20250902-mq-cake-sub-qdisc-cdf0b59d2fe5