[<prev] [next>] [day] [month] [year] [list]
Message-ID: <aLGuuBQPpOqZWJuD@poste8964.lip6.fr>
Date: Fri, 29 Aug 2025 15:44:24 +0200
From: Nicolas Peugnet <nicolas@...b1.fr>
To: netdev@...r.kernel.org
Cc: Jamal Hadi Salim <jhs@...atatu.com>,
Cong Wang <xiyou.wangcong@...il.com>, Jiri Pirko <jiri@...nulli.us>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>
Subject: [RFC PATCH] net/sched: make drr play nicely with non-work-conserving
qdiscs
Hi, this is my first time trying to contribute to Linux, so I will
probably make a lot of mistakes, apologies in advance.
I wanted to use a classful qdisc to set multiple netem qdiscs on an
iterface with different parameters and filters to classify packets.
Among all classful qdiscs, PRIO is the only one until now that plays
nicely with inner non-work-conserving qdiscs, but it is limited to 16
bands/classes and it has some level of integrated priorisation logic.
DRR on the other hand is the closest qdisc we have to a fully generic
classful qdisc without any integrated logic, but it currently does not
work correctly with non-work-conserving qdiscs.
This patch allows DRR to play nicely with inner non-work-conserving
qdiscs. Instead of returning NULL when the current active class returns
NULL (i.e. its qdisc is non-work-conserving), we iterate over each
active classes once until we find one ready to dequeue. This allows not
to delay the following queues if the first one is slower.
Of course, the complexity is not O(1) any more, as we potentially
iterate once over each active class. But this will only happen if the
inner qdiscs are non-work-conserving, a use-case that was previously not
working correctly anyway.
The documentation of tc-drr(8) will have to be updated to reflect this
change, explaning that non-work-conserving qdiscs are supported but that
the complexity is not O(1) in this case.
This is a proof of concept, it may most likely be optimized by using
list_bulk_move_tail() only once we find a class ready to dequeue instead
of calling list_move_tail() on each iteration.
This is a Request For Comments, as I would like to know if other people
would be interested by such a change and if the implementation make at
least a bit of sense to anyone else than me.
Signed-off-by: Nicolas Peugnet <nicolas@...b1.fr>
---
net/sched/sch_drr.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c
index 9b6d79bd8737..8078bdf88150 100644
--- a/net/sched/sch_drr.c
+++ b/net/sched/sch_drr.c
@@ -373,6 +373,7 @@ static int drr_enqueue(struct sk_buff *skb, struct Qdisc *sch,
static struct sk_buff *drr_dequeue(struct Qdisc *sch)
{
struct drr_sched *q = qdisc_priv(sch);
+ struct list_head *first;
struct drr_class *cl;
struct sk_buff *skb;
unsigned int len;
@@ -380,11 +381,15 @@ static struct sk_buff *drr_dequeue(struct Qdisc *sch)
if (list_empty(&q->active))
goto out;
while (1) {
- cl = list_first_entry(&q->active, struct drr_class, alist);
- skb = cl->qdisc->ops->peek(cl->qdisc);
- if (skb == NULL) {
- qdisc_warn_nonwc(__func__, cl->qdisc);
- goto out;
+ first = q->active.next;
+ while (1) {
+ cl = list_first_entry(&q->active, struct drr_class, alist);
+ skb = cl->qdisc->ops->peek(cl->qdisc);
+ if (skb != NULL)
+ break;
+ list_move_tail(&cl->alist, &q->active);
+ if (q->active.next == first)
+ goto out;
}
len = qdisc_pkt_len(skb);
--
2.39.5
Powered by blists - more mailing lists