linux-ext4 - Re: [BUG] WARNING in mb_avg_fragment_size

Message-ID: <aCxuBdzvhK8lfmAQ@li-dc0c254c-257c-11b2-a85c-98b6c1322444.ibm.com>
Date: Tue, 20 May 2025 17:26:53 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: Guoyu Yin <y04609127@...il.com>
Cc: "Theodore Ts'o" <tytso@....edu>, adilger.kernel@...ger.ca,
        linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [BUG] WARNING in mb_avg_fragment_size_order

On Tue, May 20, 2025 at 01:12:11PM +0800, Guoyu Yin wrote:
> Hi,
> 
> This crash can be triggered by repeatedly performing specific file
> operations, causing a kernel warning in the Ext4 multi-block allocator
> (mballoc) module. The issue arises in mb_avg_fragment_size_order due
> to an invalid len parameter leading to an out-of-bound order, thus
> triggering the WARNING.
> 
> Root Cause:
> 1. Code Path: In ext4_mb_choose_next_group_best_avail(),
> ac_g_ex.fe_len might be incorrectly calculated as an excessively large
> value (e.g., via roundup).
> 2. Invalid Parameter: When ac_g_ex.fe_len is too large, order =
> fls(len) - 2 in mb_avg_fragment_size_order() exceeds
> MB_NUM_ORDERS(sb), triggering WARN_ON_ONCE(order > MB_NUM_ORDERS(sb)).
> 
> Code Locations:
> fs/ext4/mballoc.c:834.
> 
> Proposed Fix:
> 1. Add validity checks for ac_g_ex.fe_len in
> ext4_mb_choose_next_group_best_avail() to ensure it does not exceed 1
> << (MB_NUM_ORDERS(sb) + 2).
> 2. Enforce strict input validation for len in
> mb_avg_fragment_size_order() to reject invalid values.
> 
> This can be reproduced on:
> HEAD commit:
> 
> fac04efc5c793dccbd07e2d59af9f90b7fc0dca4
> 
> report: https://pastebin.com/raw/W5ejqsNx
> 
> console output : https://pastebin.com/raw/U9qUGBhY
> 
> kernel config: https://pastebin.com/raw/zrj9jd1V
> 
> C reproducer : https://pastebin.com/raw/TCwWzfaH

Hi Guoyu,

A quick run of the reproducer does not hit this issue for me. I'll
try again with the config you provided.

Also, it's strange that we hit this, since ext4_mb_normalize_request()
makes sure the goal doesn't exceed the maximum number of blocks that
buddy can allocate in one shot (i.e. 1 << (blkbits + 1)), which should
in turn ensure that the goal length order is never greater than
MB_NUM_ORDERS(sb).
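
To make the arithmetic concrete, here is a minimal userspace sketch of the
order calculation described in the report (order = fls(len) - 2, warning when
order > MB_NUM_ORDERS(sb)). It is not the kernel code: the *_sketch names are
hypothetical helpers, fls_sketch() is a stand-in for the kernel's fls(), and
MB_NUM_ORDERS(sb) is assumed to be blkbits + 2.

```c
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel's fls(): returns the 1-based index of
 * the highest set bit, or 0 when x is 0.
 */
static int fls_sketch(unsigned int x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* Assumption: MB_NUM_ORDERS(sb) equals the block size bits plus 2. */
static int mb_num_orders_sketch(int blkbits)
{
	return blkbits + 2;
}

/* The raw order from the report, before any clamping: fls(len) - 2. */
static int raw_order_sketch(unsigned int len)
{
	return fls_sketch(len) - 2;
}

/* True when the condition order > MB_NUM_ORDERS(sb) from the report holds. */
static bool would_warn_sketch(int blkbits, unsigned int len)
{
	return raw_order_sketch(len) > mb_num_orders_sketch(blkbits);
}
```

Under these assumptions, with 4K blocks (blkbits = 12) a goal length of
1 << (blkbits + 1) = 8192 gives order 12, well below the limit of 14, while
the condition only becomes true once len reaches 1 << 16. That is why a goal
length that has passed through normalization is not expected to trip it.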

I'll try to see if I can hit it. In the meantime, if you are easily able
to replicate it, can you provide the following information:

1. I see you are testing on kernel v6.13-rc2, which is slightly old now.
Can you check whether you are able to hit it on the latest mainline kernel
(v6.15-rc*)?

2. Also, if possible, can you please share the output collected with the
following tracing enabled, for example:

  sudo trace-cmd record -e "ext4:ext4_mballoc_alloc" ./reproducer

and then

  sudo trace-cmd report -i trace.dat 

to view the output in text format. (You can also use
perf record -e "ext4:ext4_mballoc_alloc" to collect this.)

Regards,
ojaswin

> 
> Best regards,
> Guoyu