netdev - Re: Incomplete fix for recent bug in tc / hfsc

Message-ID: <3af4930b-6773-4159-8a7a-e4f6f6ae8109@gmail.com>
Date: Tue, 24 Jun 2025 11:24:20 +0200
From: Lion Ackermann <nnamrec@...il.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: netdev@...r.kernel.org, Jamal Hadi Salim <jhs@...atatu.com>,
 Jiri Pirko <jiri@...nulli.us>
Subject: Re: Incomplete fix for recent bug in tc / hfsc

Hi,

On 6/24/25 6:41 AM, Cong Wang wrote:
> On Mon, Jun 23, 2025 at 12:41:08PM +0200, Lion Ackermann wrote:
>> Hello,
>>
>> I noticed the fix for a recent bug in sch_hfsc in the tc subsystem is
>> incomplete:
>>     sch_hfsc: Fix qlen accounting bug when using peek in hfsc_enqueue()
>>     https://lore.kernel.org/all/20250518222038.58538-2-xiyou.wangcong@gmail.com/
>>
>> This patch also included a test which landed:
>>     selftests/tc-testing: Add an HFSC qlen accounting test
>>
>> Basically running the included test case on a sanitizer kernel or with
>> slub_debug=P will directly reveal the UAF:
> 
> Interesting, I have SLUB debugging enabled in my kernel config too:
> 
> CONFIG_SLUB_DEBUG=y
> CONFIG_SLUB_DEBUG_ON=y
> CONFIG_SLUB_RCU_DEBUG=y
> 
> But I didn't catch this bug.
>  

Technically, the class deletion step that triggers the sanitizer is not
present in your testcase. The testcase only leaves the stale pointer behind;
nothing ever accesses it afterwards, so nothing is reported.
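
To make the sequence concrete, here is a toy Python model of what I described
in my original report (quoted below). It is only a sketch under assumptions:
ToyParent, ToyClass, child_enqueue and run_scenario are hypothetical names,
the parent is only loosely DRR-like, and none of this is kernel code. It shows
that the enqueue alone leaves a stale entry in the active list, and that the
problem becomes visible only once the class is deleted and the list is walked.

```python
class UseAfterFree(Exception):
    """Raised by the toy model when a freed class is touched."""


class ToyClass:
    def __init__(self, name):
        self.name = name
        self.child_qlen = 0   # backlog of the child qdisc
        self.freed = False


class ToyParent:
    """Hypothetical DRR-like parent; NOT the kernel implementation."""

    def __init__(self):
        self.classes = {}
        self.active = []

    def add_class(self, name):
        self.classes[name] = ToyClass(name)

    def qlen_notify(self, cl):
        # Child reported an empty queue: drop the class from the active list.
        if cl in self.active:
            self.active.remove(cl)

    def child_enqueue(self, cl, drop_during_enqueue):
        # Stand-in for a child that dequeues/peeks inside its enqueue handler.
        cl.child_qlen += 1
        if drop_during_enqueue:
            cl.child_qlen -= 1
            self.qlen_notify(cl)   # parent is told *before* enqueue returns
        return True                # enqueue still reports success

    def enqueue(self, name, drop_during_enqueue=False):
        cl = self.classes[name]
        was_empty = cl.child_qlen == 0
        ok = self.child_enqueue(cl, drop_during_enqueue)
        if ok and was_empty:
            self.active.append(cl)  # activated only after enqueue completes
        return ok

    def delete_class(self, name):
        cl = self.classes.pop(name)
        if cl.child_qlen > 0:       # only a non-empty child triggers a notify
            cl.child_qlen = 0
            self.qlen_notify(cl)
        cl.freed = True             # stale entry may remain in self.active

    def dequeue(self):
        for cl in self.active:
            if cl.freed:
                raise UseAfterFree(cl.name)
        return None


def run_scenario(delete_class):
    """Return True if the toy model detects a use-after-free."""
    parent = ToyParent()
    parent.add_class("1:1")
    parent.enqueue("1:1", drop_during_enqueue=True)
    if delete_class:
        parent.delete_class("1:1")
    try:
        parent.dequeue()
    except UseAfterFree:
        return True
    return False
```

In this model run_scenario(False) corresponds to the landed testcase (stale
entry, never touched), while run_scenario(True) adds the class deletion step.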

>> To be completely honest I do not quite understand the rationale behind the
>> original patch. The problem is that the backlog corruption propagates to
>> the parent _before_ parent is even expecting any backlog updates.
>> Looking at f.e. DRR: Child is only made active _after_ the enqueue completes.
>> Because HFSC is messing with the backlog before the enqueue completed, 
>> DRR will simply make the class active even though it should have already
>> removed the class from the active list due to qdisc_tree_backlog_flush.
>> This leaves the stale class in the active list and causes the UAF.
>>
>> Looking at other qdiscs the way DRR handles child enqueues seems to resemble
>> the common case. HFSC calling dequeue in the enqueue handler violates
>> expectations. In order to fix this either HFSC has to stop using dequeue or
>> all classful qdiscs have to be updated to catch this corner case where
>> child qlen was zero even though the enqueue succeeded. Alternatively HFSC
>> could signal enqueue failure if it sees child dequeue dropping packets to
>> zero? I am not sure how this all plays out with the re-entrant case of
>> netem though.
> 
> I think this may be the same bug report from Mingi in the security
> mailing list. I will take a deep look after I go back from Open Source
> Summit this week. (But you are still very welcome to work on it by
> yourself, just let me know.)
> 
> Thanks!

> My suggestion is we go back to a proposal i made a few moons back (was
> this in a discussion with you? i dont remember): create a mechanism to
> disallow certain hierarchies of qdiscs based on certain attributes,
> example in this case disallow hfsc from being the ancestor of "qdiscs that may
> drop during peek" (such as netem). Then we can just keep adding more
> "disallowed configs" that will be rejected via netlink. Similar idea
> is being added to netem to disallow double duplication, see:
> https://lore.kernel.org/netdev/20250622190344.446090-1-will@willsroot.io/
> 
> cheers,
> jamal

I vaguely remember Jamal's proposal from a while back, and I believe there
was already some example code for this approach?
Since there is another report, you have the better overview, so it is
probably best that you look at it first. In the meantime I can think about
the solution a bit more and possibly draft something if you wish.
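
Just to have something to point at in the discussion, a check along the lines
of Jamal's suggestion could look roughly like the Python sketch below. This is
purely hypothetical and is not the earlier example code: the attribute sets
and the hierarchy_allowed() helper are made up for illustration, and the only
entries are the ones named in this thread (hfsc as ancestor, netem as a qdisc
that may drop during peek).

```python
# Hypothetical attribute tables; which qdiscs really belong here is an
# open question and not established by this sketch.
MAY_DROP_DURING_PEEK = {"netem"}
RESTRICTED_ANCESTORS = {"hfsc"}


def hierarchy_allowed(path):
    """Check one root-to-leaf chain of qdisc kinds, e.g. ["hfsc", "netem"].

    Returns False if a restricted ancestor appears anywhere above a qdisc
    that may drop packets during peek.
    """
    seen_restricted_ancestor = False
    for kind in path:
        if seen_restricted_ancestor and kind in MAY_DROP_DURING_PEEK:
            return False
        if kind in RESTRICTED_ANCESTORS:
            seen_restricted_ancestor = True
    return True
```

A config such as ["drr", "hfsc", "netem"] would then be rejected, while
["netem", "hfsc"] or ["hfsc", "pfifo"] would still be accepted.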

Thanks,
Lion