Message-ID: <20250811102449.50e5f416@kernel.org>
Date: Mon, 11 Aug 2025 10:24:49 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: William Liu <will@...lsroot.io>
Cc: netdev@...r.kernel.org, jhs@...atatu.com, xiyou.wangcong@...il.com,
pabeni@...hat.com, jiri@...nulli.us, davem@...emloft.net,
edumazet@...gle.com, horms@...nel.org, savy@...t3mfailure.io,
victor@...atatu.com
Subject: Re: [PATCH net v4 1/2] net/sched: Fix backlog accounting in
qdisc_dequeue_internal
On Mon, 11 Aug 2025 16:52:51 +0000 William Liu wrote:
> > > Can you elaborate on this?
> > >
> > > I just moved the reset of two cstats fields from the dequeue handler
> > > epilogue to the prologue. Those specific cstats fields are not used
> > > elsewhere so they should be fine.
> >
> >
> > That's the disconnect. AFAICT they are passed to codel_dequeue(),
> > and will be used during normal dequeue, as part of normal active
> > queue management under traffic.
> >
>
> Yes, that is the only place those values are used. From my
> understanding, codel_dequeue is only called from fq_codel_dequeue.
> So moving the reset from the dequeue epilogue to the dequeue
> prologue should be fine, since the behavior is unchanged -
> codel_dequeue should always see the same values.
>
> Is there a case I am not seeing? If so, I can just add additional
> fields to the fq_codel_sched_data, but wanted to avoid doing that for
> this one edge case.
This sort of separation of logic is very error-prone in general.
If you're asking for a specific bug that would exist with your
patch - I believe that two subsequent fq_codel_change() calls,
the first one reducing the limit, the second one _not_ reducing it
(and therefore never invoking dequeue), will adjust the backlog twice.
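To spell that out - with the reset moved out of the epilogue, the
tail of fq_codel_change() effectively behaves like this across the
two calls (my sketch of the scenario, not your actual diff):

	/* change #1: limit lowered, the shrink loop ran and left
	 * q->cstats.drop_count / drop_len non-zero */
	qdisc_tree_reduce_backlog(sch, q->cstats.drop_count,
				  q->cstats.drop_len);
	/* no q->cstats reset here any more */

	/* change #2: limit not lowered, the loop never runs, so no
	 * dequeue happens and the prologue reset never fires - the
	 * same stale counts get reported a second time */
	qdisc_tree_reduce_backlog(sch, q->cstats.drop_count,
				  q->cstats.drop_len);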
As I commented in the previous message - wouldn't counting the
packets we actually dequeue solve this problem? Something like:
	unsigned int pkts = 0, bytes = 0;

	while (sch->q.qlen > sch->limit ||
	       q->memory_usage > q->memory_limit) {
		struct sk_buff *skb = qdisc_dequeue_internal(sch, false);

		if (!skb) /* queue already empty */
			break;
		pkts++;
		bytes += qdisc_pkt_len(skb);
		rtnl_kfree_skbs(skb, skb);
	}
	/* adjust the backlog only by what we actually freed */
	qdisc_tree_reduce_backlog(sch, pkts, bytes);
"Conceptually", we are only responsible for adjusting the backlog
for the skbs we actually gave to kfree_skb().
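The nice property is that pkts/bytes are local to a single
invocation: a later fq_codel_change() that never dequeues starts
from zero again and its qdisc_tree_reduce_backlog() call is a no-op,
so nothing can leak between change requests, and the cstats handling
in the regular dequeue path stays untouched.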