Message-ID: <aBfEk6QUI//BIyZC@pop-os.localdomain>
Date: Sun, 4 May 2025 13:18:04 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Victor Nogueira <victor@...atatu.com>
Cc: netdev@...r.kernel.org, jhs@...atatu.com, jiri@...nulli.us,
alan@...ie.me.uk
Subject: Re: [Patch net 1/2] sch_htb: make htb_deactivate() idempotent
On Wed, Apr 30, 2025 at 12:19:03PM -0300, Victor Nogueira wrote:
> On 4/28/25 20:29, Cong Wang wrote:
> > Alan reported a NULL pointer dereference in htb_next_rb_node()
> > after we made htb_qlen_notify() idempotent.
> >
> > It turns out that in the following case it introduced a regression:
> >
> > htb_dequeue_tree():
> >   |-> fq_codel_dequeue()
> >     |-> qdisc_tree_reduce_backlog()
> >       |-> htb_qlen_notify()
> >         |-> htb_deactivate()
> >   |-> htb_next_rb_node()
> >   |-> htb_deactivate()
> >
> > For htb_next_rb_node(): after the 1st htb_deactivate() call,
> > clprio[prio]->ptr may already have been set to NULL, which leaves
> > htb_next_rb_node() vulnerable here.
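To spell out why a NULL ptr is fatal there: rb_next() dereferences its
argument before doing anything else. The following paraphrases
lib/rbtree.c and include/linux/rbtree.h (illustrative, not verbatim):

        #define RB_EMPTY_NODE(node) \
                ((node)->__rb_parent_color == (unsigned long)(node))

        struct rb_node *rb_next(const struct rb_node *node)
        {
                if (RB_EMPTY_NODE(node))  /* a NULL 'node' faults right here */
                        return NULL;
                /* ... rest of the in-order walk elided ... */
        }

So once the 1st htb_deactivate() has cleared clprio[prio]->ptr, the
unconditional *n = rb_next(*n) in htb_next_rb_node() crashes.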
>
> If I'm not missing something, the issue seems to be that
> fq_codel_dequeue() or codel_qdisc_dequeue() may call
> qdisc_tree_reduce_backlog() with sch->q.qlen == 0 after commit
> 342debc12183. This causes htb_qlen_notify() to be called, which
> deactivates the class before we call htb_next_rb_node() further down
> in htb_dequeue_tree() (as you said above).
>
> If that's so, couldn't we, instead of doing:
>
> > @@ -348,7 +348,8 @@ static void htb_add_to_wait_tree(struct htb_sched *q,
> >   */
> >  static inline void htb_next_rb_node(struct rb_node **n)
> >  {
> > -        *n = rb_next(*n);
> > +        if (*n)
> > +                *n = rb_next(*n);
> >  }
>
> do something like:
>
> @@ -921,7 +921,9 @@ static struct sk_buff *htb_dequeue_tree(struct htb_sched *q, const int prio,
>                  cl->leaf.deficit[level] -= qdisc_pkt_len(skb);
>                  if (cl->leaf.deficit[level] < 0) {
>                          cl->leaf.deficit[level] += cl->quantum;
> -                        htb_next_rb_node(level ? &cl->parent->inner.clprio[prio].ptr :
> +                        /* Account for (fq_)codel child deactivating after dequeue */
> +                        if (likely(cl->prio_activity))
> +                                htb_next_rb_node(level ?
It reads oddly to me to check cl->prio_activity right before
htb_next_rb_node(), and I don't see that pattern anywhere else in the
existing code.

My patch pretty much follows the existing patterns of checking either
cl->prio_activity or cl->leaf.q->q.qlen. So, although it looks bigger in
the diffstat, it is safer from that point of view.
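To make that concrete, the idea of patch 1/2 is roughly the following.
This is a sketch of the cl->prio_activity guard, not the exact hunk:

        /* Deactivation becomes a no-op when the class is already inactive,
         * so a second call (e.g. via htb_qlen_notify() during a child
         * dequeue) is harmless.
         */
        static inline void htb_deactivate(struct htb_sched *q, struct htb_class *cl)
        {
                if (!cl->prio_activity)
                        return;
                htb_deactivate_prios(q, cl);
                cl->prio_activity = 0;
        }

That keeps the check in the same form as the cl->prio_activity tests that
already exist elsewhere in sch_htb.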
Thanks!