lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c8110ec8-a868-b531-a230-f4b645f5ac73@applied-asynchrony.com>
Date: Mon, 28 Apr 2025 14:28:14 +0200
From: Holger Hoffstätte <holger@...lied-asynchrony.com>
To: Greg KH <gregkh@...uxfoundation.org>
Cc: "Alan J. Wylie" <alan@...ie.me.uk>, Jamal Hadi Salim <jhs@...atatu.com>,
 regressions@...ts.linux.dev, Cong Wang <xiyou.wangcong@...il.com>,
 Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org, Octavian Purdila <tavip@...gle.com>,
 Toke Høiland-Jørgensen <toke@...hat.com>,
 stable@...r.kernel.org
Subject: Re: [REGRESSION] 6.14.3 panic - kernel NULL pointer dereference in
 htb_dequeue

On 2025-04-28 13:45, Greg KH wrote:
> On Tue, Apr 22, 2025 at 07:20:24PM +0200, Holger Hoffstätte wrote:
>> (cc: Greg KH)
>>
>> On 2025-04-22 18:51, Alan J. Wylie wrote:
>>> On Mon, 21 Apr 2025 21:09:27 +0100
>>> "Alan J. Wylie" <alan@...ie.me.uk> wrote:
>>>
>>>> On Mon, 21 Apr 2025 21:47:44 +0200
>>>> Holger Hoffstätte <holger@...lied-asynchrony.com> wrote:
>>>>
>>>>>> I'm afraid that didn't help. Same panic.
>>>>>
>>>>> Bummer :-(
>>>>>
>>>>> Might be something else missing then - so for now the only other
>>>>> thing I'd suggest is to revert the removal of the qlen check in
>>>>> fq_codel.
>>>>
>>>> Like this?
>>>>
>>>> $ git diff  sch_fq_codel.c
>>>> diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
>>>> index 6c9029f71e88..4fdf317b82ec 100644
>>>> --- a/net/sched/sch_fq_codel.c
>>>> +++ b/net/sched/sch_fq_codel.c
>>>> @@ -316,7 +316,7 @@ static struct sk_buff *fq_codel_dequeue(struct
>>>> Qdisc *sch) qdisc_bstats_update(sch, skb);
>>>>           flow->deficit -= qdisc_pkt_len(skb);
>>>> -       if (q->cstats.drop_count) {
>>>> +       if (q->cstats.drop_count && sch->q.qlen) {
>>>>                   qdisc_tree_reduce_backlog(sch, q->cstats.drop_count,
>>>>                                             q->cstats.drop_len);
>>>>                   q->cstats.drop_count = 0;
>>>> $
>>>>
>>>
>>> It's been about 21 hours and no crash yet. I had an excellent day down
>>> a cave, so there's not been as much Internet traffic as usual, but
>>> there's a good chance the above patch as at least worked around, if not
>>> fixed the issue.
>>
>> Thought so .. \o/
>>
>> I guess now the question is what to do about it. IIUC the fix series [1]
>> addressed some kind of UAF problem, but obviously was not applied
>> correctly or is missing follow-ups. It's also a bit mysterious why
>> adding the HTB patch didn't work.
>>
>> Maybe Cong Wang can advise what to do here?
>>
>> So unless someone else has any ideas: Greg, please revert:
>>
>> 6.14.y/a57fe60ef4cf96bfbb6b58397ec28bdb5a5c6b31
>> ("codel: remove sch->q.qlen check before qdisc_tree_reduce_backlog()")
>>
>> and probably from 6.12 as well.
> 
> Why only those 2 branches?  What about all others, and mainline?

All branches that received upstream 342debc12183 aka
"codel: remove sch->q.qlen check before qdisc_tree_reduce_backlog()"
should be reverted for now. I previously didn't check them all.

It was part of a series, and the rest of the series is missing (not
picked by autosel). Luckily that does not matter because mainline
is apparently also buggy, so applying the rest of the series
would not help. This is currently still being debugged/worked on.

Just reverting this one commit provably fixes the crash for people
using -stable in e.g. gateway/router setups, with a htb->fq_codel chain.
The crash does not affect people using fq_codel as root qdisc.

Reverting is not the greatest solution since it reintroduces a potential
UAF, which has been there since fq_codel was first committed.
IIUC this potential UAF can only be induced with admin privileges.

I don't know how to explain this any better; all I did was try to
help Alan. :-(

-h

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ