Message-ID: <96927022-92d9-46be-8af4-225cab01b006@redhat.com>
Date: Tue, 22 Jul 2025 15:32:04 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: William Liu <will@...lsroot.io>, netdev@...r.kernel.org
Cc: jhs@...atatu.com, xiyou.wangcong@...il.com, kuba@...nel.org,
jiri@...nulli.us, davem@...emloft.net, edumazet@...gle.com,
horms@...nel.org, savy@...t3mfailure.io
Subject: Re: [PATCH net 1/2] net/sched: Fix backlog accounting in
qdisc_dequeue_internal
On 7/19/25 8:08 PM, William Liu wrote:
> This issue applies to the following qdiscs: hhf, fq, fq_codel, and
> fq_pie, and occurs in their change handlers when adjusting to the new
> limit. The values passed to the subsequent qdisc_tree_reduce_backlog
> call have the following problems:
>
> 1. When the tbf parent runs out of tokens, skbs of these qdiscs will
> be placed in gso_skb. Their peek handlers are qdisc_peek_dequeued,
> which accounts for both qlen and backlog. However, in the case of
> qdisc_dequeue_internal, ONLY qlen is accounted for when pulling
> from gso_skb. This means that these qdiscs are missing a
> qdisc_qstats_backlog_dec when dropping packets to satisfy the
> new limit in their change handlers.
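>
> For reference, the gso_skb branch of qdisc_dequeue_internal currently
> looks roughly like this (a sketch of the include/net/sch_generic.h
> helper, not an exact copy):
>
>     static inline struct sk_buff *
>     qdisc_dequeue_internal(struct Qdisc *sch, bool direct)
>     {
>             struct sk_buff *skb;
>
>             skb = __skb_dequeue(&sch->gso_skb);
>             if (skb) {
>                     sch->q.qlen--;  /* qlen is adjusted here... */
>                     return skb;     /* ...but backlog is not */
>             }
>             if (direct)
>                     return __qdisc_dequeue_head(&sch->q);
>             return sch->dequeue(sch);
>     }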
>
> One can observe this issue with the following (with tc patched to
> support a limit of 0):
>
> export TARGET=fq
> tc qdisc del dev lo root
> tc qdisc add dev lo root handle 1: tbf rate 8bit burst 100b latency 1ms
> tc qdisc replace dev lo handle 3: parent 1:1 $TARGET limit 1000
> echo ''; echo 'add child'; tc -s -d qdisc show dev lo
> ping -I lo -f -c2 -s32 -W0.001 127.0.0.1 2>&1 >/dev/null
> echo ''; echo 'after ping'; tc -s -d qdisc show dev lo
> tc qdisc change dev lo handle 3: parent 1:1 $TARGET limit 0
> echo ''; echo 'after limit drop'; tc -s -d qdisc show dev lo
> tc qdisc replace dev lo handle 2: parent 1:1 sfq
> echo ''; echo 'post graft'; tc -s -d qdisc show dev lo
>
> The second-to-last show command shows 0 packets but a positive
> number (74) of backlog bytes. The problem becomes clearer in the
> last show command, where qdisc_purge_queue triggers
> qdisc_tree_reduce_backlog with the positive backlog and causes an
> underflow in the tbf parent's backlog (4096 Mb instead of 0).
>
> 2. fq_codel_change is also wrong in the non gso_skb case. It tracks
> the amount to drop after the limit adjustment loop through
> cstats.drop_count and cstats.drop_len, but these are also updated
> in fq_codel_dequeue, and reset every time they are non-zero in that
> function after a call to qdisc_tree_reduce_backlog.
> If the drop path ever occurs in fq_codel_dequeue and
> qdisc_dequeue_internal takes the non gso_skb path, then we would
> reduce the backlog by an extra packet.
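>
> For context, the tail of fq_codel_dequeue already flushes and clears
> these counters, roughly (a sketch, not an exact copy):
>
>     if (q->cstats.drop_count && sch->q.qlen) {
>             qdisc_tree_reduce_backlog(sch, q->cstats.drop_count,
>                                       q->cstats.drop_len);
>             q->cstats.drop_count = 0;
>             q->cstats.drop_len = 0;
>     }
>
> so a change handler that keeps accumulating into cstats across such
> a dequeue can end up reducing the backlog twice for the same packets.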
>
> To fix these issues, the codepath for all clients of
> qdisc_dequeue_internal (codel, pie, hhf, fq, fq_pie, and fq_codel)
> has been simplified: qdisc_dequeue_internal now handles the backlog
> adjustments for all cases that do not directly use the dequeue
> handler.
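>
> A minimal sketch of that direction (upstream helper names, exact
> body an assumption):
>
>     skb = __skb_dequeue(&sch->gso_skb);
>     if (skb) {
>             sch->q.qlen--;
>             qdisc_qstats_backlog_dec(sch, skb); /* keep backlog in sync */
>             return skb;
>     }
>     if (direct) {
>             skb = __qdisc_dequeue_head(&sch->q);
>             if (skb)
>                     qdisc_qstats_backlog_dec(sch, skb);
>             return skb;
>     }
>     return sch->dequeue(sch);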
>
> Special care is taken for fq_codel_dequeue to account for the
> qdisc_tree_reduce_backlog call in its dequeue handler. The
> cstats reset is moved from the end to the beginning of
> fq_codel_dequeue, so the change handler can use cstats for
> proper backlog reduction accounting. The drop_len and drop_count
> fields are not used elsewhere, so this reordering in
> fq_codel_dequeue is ok.
>
> Fixes: 2d3cbfd6d54a ("net_sched: Flush gso_skb list too during ->change()")
> Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM")
> Fixes: 10239edf86f1 ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc")
>
> Signed-off-by: William Liu <will@...lsroot.io>
> Reviewed-by: Savino Dicanosa <savy@...t3mfailure.io>
Please avoid blank lines in the tag area, i.e. between 'Fixes' and the SoB.
Also I think this could/should be split into several patches, one for
each affected qdisc.
> diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> index 2a0f3a513bfa..f9e6d76a1712 100644
> --- a/net/sched/sch_fq_codel.c
> +++ b/net/sched/sch_fq_codel.c
> @@ -286,6 +286,10 @@ static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch)
> struct fq_codel_flow *flow;
> struct list_head *head;
>
> + /* reset these here, as change needs them for proper accounting */
> + q->cstats.drop_count = 0;
> + q->cstats.drop_len = 0;
> +
> begin:
> head = &q->new_flows;
> if (list_empty(head)) {
> @@ -319,8 +323,6 @@ static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch)
> if (q->cstats.drop_count) {
Why is this 'if' still needed? Isn't drop_count always 0 here?
/P