[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <EBHeQZeq5AJteszZoHrsiJv6EGOnuByQ-XNejgA9WiqQ8g2jIXowzoGjuJowDcV6xi9xBgyMTwNlS8wN0zUOlRl4Bl2Mv-x883IKCvdySyU=@syst3mfailure.io>
Date: Tue, 29 Apr 2025 13:41:19 +0000
From: Savy <savy@...t3mfailure.io>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: Will <willsroot@...tonmail.com>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>, jhs@...atatu.com, jiri@...nulli.us
Subject: Re: [BUG] net/sched: Race Condition and Null Dereference in codel_change, pie_change, fq_pie_change, fq_codel_change, hhf_change
On Monday, April 28th, 2025 at 7:53 PM, Cong Wang <xiyou.wangcong@...il.com> wrote:
>
>
> Excellent analysis!
>
> Do you mind testing the following patch?
>
> Note:
>
> 1) We can't just test NULL, because otherwise we would leak the skb's
> in gso_skb list.
>
> 2) I am totally aware that maybe there are some other cases need the
> same fix, but I want to be conservative here since this will be
> targeting for -stable. It is why I intentionally keep my patch minimum.
>
> Thanks!
>
> --------------->
>
Hi Cong,
Thank you for the reply. We have tested your patch and can confirm that it resolves the issue.
However, regarding point [1], we conducted some tests to verify if there is a skb leak in the gso_skb list,
but the packet remains in the list only for a limited amount of time.
In our POC we set a very low TBF rate, so when the Qdisc runs out of tokens,
it reschedules itself via qdisc watchdog after approximately 45 seconds.
Returning to the example above, here is what happens when the watchdog timer fires:
[ ... ]
Packet 2 is sent:
[ ... ]
tbf_dequeue()
qdisc_peek_dequeued()
skb_peek(&sch->gso_skb) // sch->gso_skb is empty
codel_qdisc_dequeue() // Codel qlen is 1
qdisc_dequeue_head()
// Packet 2 is removed from the queue
// Codel qlen = 0
__skb_queue_head(&sch->gso_skb, skb); // Packet 2 is added to gso_skb list
sch->q.qlen++ // Codel qlen = 1
// TBF runs out of tokens and reschedules itself for later
qdisc_watchdog_schedule_ns()
codel_change() // Patched, (!skb) break;, does not crash
// ... ~45 seconds later the qdisc watchdog timer fires
tbf_dequeue()
qdisc_peek_dequeued()
skb_peek(&sch->gso_skb) // sch->gso_skb is _not_ empty (contains Packet 2)
// TBF now has enough tokens
qdisc_dequeue_peeked()
skb = __skb_dequeue(&sch->gso_skb) // Packet 2 is removed from the gso_skb list
sch->q.qlen-- // Codel qlen = 0
Notice how the gso_skb list is correctly cleaned up when the watchdog timer fires.
We also examined some edge cases, such as when the watchdog is canceled
and there are still packets left in the gso_skb list, and it is always cleaned up:
Qdisc destruction case:
tbf_destroy()
qdisc_put()
__qdisc_destroy()
qdisc_reset()
__skb_queue_purge(&qdisc->gso_skb);
Qdisc reset case:
tbf_reset()
qdisc_reset()
__skb_queue_purge(&qdisc->gso_skb);
Perhaps the skb leak you mentioned occurs in another edge case that we overlooked?
In any case, we believe your patch is technically more correct,
as it makes sense to clean up packets in the gso_skb list first when the limit changes.
Best regards,
Savy
Will
Powered by blists - more mailing lists