[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <eecd9a29-14f9-432d-a3cf-5215313df9f0@mojatatu.com>
Date: Wed, 30 Apr 2025 12:19:03 -0300
From: Victor Nogueira <victor@...atatu.com>
To: Cong Wang <xiyou.wangcong@...il.com>, netdev@...r.kernel.org
Cc: jhs@...atatu.com, jiri@...nulli.us, alan@...ie.me.uk
Subject: Re: [Patch net 1/2] sch_htb: make htb_deactivate() idempotent
On 4/28/25 20:29, Cong Wang wrote:
> Alan reported a NULL pointer dereference in htb_next_rb_node()
> after we made htb_qlen_notify() idempotent.
>
> It turns out in the following case it introduced some regression:
>
> htb_dequeue_tree():
> |-> fq_codel_dequeue()
> |-> qdisc_tree_reduce_backlog()
> |-> htb_qlen_notify()
> |-> htb_deactivate()
> |-> htb_next_rb_node()
> |-> htb_deactivate()
>
> For htb_next_rb_node(), after calling the 1st htb_deactivate(), the
> clprio[prio]->ptr could be already set to NULL, which means
> htb_next_rb_node() is vulnerable here.
If I'm not missing something, the issue seems to be that
fq_codel_dequeue or codel_qdisc_dequeue may call qdisc_tree_reduce_backlog
with sch->q.qlen == 0 after commit 342debc12183. This will cause
htb_qlen_notify to be called which will deactivate before we
call htb_next_rb_node further down in htb_dequeue_tree (as you
said above).
If that's so, couldn't we instead of doing:
> @@ -348,7 +348,8 @@ static void htb_add_to_wait_tree(struct htb_sched *q,
> */
> static inline void htb_next_rb_node(struct rb_node **n)
> {
> - *n = rb_next(*n);
> + if (*n)
> + *n = rb_next(*n);
> }
do something like:
@@ -921,7 +921,9 @@ static struct sk_buff *htb_dequeue_tree(struct
htb_sched *q, const int prio,
cl->leaf.deficit[level] -= qdisc_pkt_len(skb);
if (cl->leaf.deficit[level] < 0) {
cl->leaf.deficit[level] += cl->quantum;
- htb_next_rb_node(level ?
&cl->parent->inner.clprio[prio].ptr :
+ /* Account for (fq_)codel child deactivating
after dequeue */
+ if (likely(cl->prio_activity))
+ htb_next_rb_node(level ?
&cl->parent->inner.clprio[prio].ptr :
&q->hlevel[0].hprio[prio].ptr);
}
/* this used to be after charge_class but this constelation
That way it's clear that the issue is a corner case where the
child qdisc deactivates the parent. Otherwise it seems like
we need to check for (*n) being NULL for every call to
htb_next_rb_node.
cheers,
Victor
Powered by blists - more mailing lists