[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <n7wl3mZsMuu8WFLvQKWdaXXRLgQDedzbO2rGqlTQgmXtY_fWGX6G_H_R9O6-T8bN7viUWnk0GL-vyDryy_hbEcIoenjw56YNQ5uypXvIgNU=@willsroot.io>
Date: Sat, 12 Jul 2025 00:08:32 +0000
From: William Liu <will@...lsroot.io>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>, Jiri Pirko <jiri@...nulli.us>, Jamal Hadi Salim <jhs@...atatu.com>, Savy <savy@...t3mfailure.io>, Paolo Abeni <pabeni@...hat.com>
Subject: Re: [BUG] Inconsistency between qlen and backlog in hhf, fq, fq_codel, and fq_pie causing WARNING in qdisc_tree_reduce_backlog
On Thursday, July 10th, 2025 at 10:40 PM, Cong Wang <xiyou.wangcong@...il.com> wrote:
>
>
> On Tue, Jul 08, 2025 at 08:21:31PM +0000, William Liu wrote:
>
> > On Monday, July 7th, 2025 at 6:51 PM, Cong Wang xiyou.wangcong@...il.com wrote:
> >
> > > On Sun, Jul 06, 2025 at 01:43:08PM +0000, William Liu wrote:
> > >
> > > > On Saturday, July 5th, 2025 at 1:04 AM, Cong Wang xiyou.wangcong@...il.com wrote:
> > > >
> > > > > On Wed, Jul 02, 2025 at 08:39:45PM +0000, William Liu wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > We write to report a bug in qlen and backlog consistency affecting hhf, fq, fq_codel, and fq_pie when acting as a child of tbf. The cause of this bug was introduced by the following fix last month designed to address a null dereference bug caused by gso segmentation and a temporary inconsistency in queue state when tbf peeks at its child while running out of tokens during tbf_dequeue [1]. We actually reported that bug but did not realize the subtle problem in the fix until now. We are aware of bugs with similar symptoms reported by Mingi [3] and Lion [4], but those are of a different root cause (at least what we can see of Mingi's report).
> > > > >
> > > > > Thanks for your report.
> > > > >
> > > > > > This works on the upstream kernel, and we have the following reproducer.
> > > > > >
> > > > > > ./tc qdisc del dev lo root
> > > > > > ./tc qdisc add dev lo root handle 1: tbf rate 8bit burst 100b latency 1ms || echo TBF
> > > > > > ./tc qdisc add dev lo handle 3: parent 1:1 hhf limit 1000 || echo HH
> > > > > > ping -I lo -f -c1 -s32 -W0.001 127.0.0.1 2>&1 >/dev/null
> > > > > > ./tc qdisc change dev lo handle 3: parent 1:1 hhf limit 0 || echo HH
> > > > > > ./tc qdisc replace dev lo handle 2: parent 1:1 sfq || echo SFQ
> > > > > >
> > > > > > Note that a patched version of tc that supports 0 limits must be built. The symptom of the bug arises in the WARN_ON_ONCE check in qdisc_tree_reduce_backlog [2], where n is 0. You can replace hhf with fq, fq_codel, and fq_pie to trigger warnings as well, though the success rate may vary.
> > > > > >
> > > > > > The root cause comes from the newly introduced function qdisc_dequeue_internal, which the change handler will trigger in the affected qdiscs [5]. When dequeuing from a non empty gso in this peek function, only qlen is decremented, and backlog is not considered. The gso insertion is triggered by qdisc_peek_dequeued, which tbf calls for these qdiscs when they are its child.
> > > > > >
> > > > > > When replacing the qdisc, tbf_graft triggers, and qdisc_purge_queue triggers qdisc_tree_reduce_backlog with the inconsistent values, which one can observe by adding printk to the passed qlen backlog values.
> > > > >
> > > > > If I understand you correctly, the problem is the inconsistent behavior
> > > > > between qdisc_purge_queue() and qdisc_dequeue_internal()? And it is
> > > > > because the former does not take care of ->gso_skb?
> > > >
> > > > No, there are 2 points of inconsistent behavior.
> > > >
> > > > 1. qdisc_dequeue_internal and qdisc_peek_dequeued. In qdisc_peek_dequeued, when a skb comes from the dequeue handler, it gets added to the gso_skb with qlen and backlog increased. In qdisc_dequeue_internal, only qlen is decreased when removing from gso.
> > >
> > > Yes, because qlen is handled by qdisc_dequeue_internal()'s callers to
> > > control their loop of sch->limit.
> >
> > This makes sense. However, if backlog is not adjusted in that helper, then they would go out of sync. qdisc_tree_reduce_backlog only adjusts counters for parent/ancestral qdiscs.
>
>
> Oh, you mean some callers miss qdisc_qstats_backlog_dec()? If you can
> confirm this is an issue and adding it could solve it, please go ahead
> to send out a patch. It makes sense to me so far, at least for aligning
> with other callers.
Ok sounds good, I will take a stab at it then.
>
> > This should help elucidate the problem:
> > export TARGET=hhf
> > ./tc qdisc del dev lo root
> > ./tc qdisc add dev lo root handle 1: tbf rate 8bit burst 100b latency 1ms || echo TBF
> > ./tc qdisc replace dev lo handle 3: parent 1:1 $TARGET limit 1000 || echo HH
> > echo ''; echo 'add child'; tc -s -d qdisc show dev lo
> > ping -I lo -f -c1 -s32 -W0.001 127.0.0.1 2>&1 >/dev/null
> > echo ''; echo 'after ping'; tc -s -d qdisc show dev lo
> > ./tc qdisc change dev lo handle 3: parent 1:1 $TARGET limit 0 || echo HH
> > echo ''; echo 'after limit drop'; tc -s -d qdisc show dev lo
> > ./tc qdisc replace dev lo handle 2: parent 1:1 sfq || echo SFQ
> > echo ''; echo 'post graft'; tc -s -d qdisc show dev lo
> >
> > Perhaps the original WARNING in qdisc_tree_reduce_backlog (which Lion's patch removed) regarding an inconsistent backlog was not a real problem then? But this backlog adjustment can be accounted for pretty easily I think. Let me know if I can help with this.
>
>
> I think Lion's patch fixes a completely different issue? At least I
> don't immediately connect it with this sch->limit issue.
Lion's patch removed the WARN_ONCE that would show this issue occurring.
> > On another note, asides from hhf (because the backlog value to qdisc_tree_reduce_backlog is 0 due to its different limit change loop), the other qdiscs cause an underflow in the backlog of tbf by the final graft.
>
>
> Does adding qdisc_qstats_backlog_dec() solve this too?
>
> Thanks.
I think it should. Will test sometime soon.
Best,
Will
Powered by blists - more mailing lists