linux-kernel - Re: [PATCH 1/1] ext4: fix crash on BUG_ON in ext4_alloc_group

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <e6758b35-fb97-4f69-8cdb-2650edea80b5@huawei.com>
Date: Fri, 27 Sep 2024 09:43:13 +0800
From: Baokun Li <libaokun1@...wei.com>
To: Eric Sandeen <sandeen@...deen.net>
CC: Jan Kara <jack@...e.cz>, Alexander Mikhalitsyn
	<aleksandr.mikhalitsyn@...onical.com>, <tytso@....edu>,
	<stable@...r.kernel.org>, Andreas Dilger <adilger.kernel@...ger.ca>,
	Stéphane Graber <stgraber@...raber.org>, Christian Brauner
	<brauner@...nel.org>, <linux-kernel@...r.kernel.org>,
	<linux-fsdevel@...r.kernel.org>, <linux-ext4@...r.kernel.org>, Wesley
 Hershberger <wesley.hershberger@...onical.com>, Yang Erkun
	<yangerkun@...wei.com>, Baokun Li <libaokun1@...wei.com>
Subject: Re: [PATCH 1/1] ext4: fix crash on BUG_ON in ext4_alloc_group_tables

On 2024/9/27 0:04, Eric Sandeen wrote:
> On 9/26/24 3:28 AM, Baokun Li wrote:
>> On 2024/9/25 23:57, Jan Kara wrote:
>>> On Wed 25-09-24 16:33:24, Alexander Mikhalitsyn wrote:
>>>> [   33.882936] EXT4-fs (dm-5): mounted filesystem 8aaf41b2-6ac0-4fa8-b92b-77d10e1d16ca r/w with ordered data mode. Quota mode: none.
>>>> [   33.888365] EXT4-fs (dm-5): resizing filesystem from 7168 to 786432 blocks
>>>> [   33.888740] ------------[ cut here ]------------
>>>> [   33.888742] kernel BUG at fs/ext4/resize.c:324!
>>> Ah, I was staring at this for a while before I understood what's going on
>>> (it would be great to explain this in the changelog BTW).  As far as I
>>> understand commit 665d3e0af4d3 ("ext4: reduce unnecessary memory allocation
>>> in alloc_flex_gd()") can actually make flex_gd->resize_bg larger than
>>> flexbg_size (for example when ogroup = flexbg_size, ngroup = 2*flexbg_size
>>> - 1) which then confuses things. I think that was not really intended and
>>> instead of fixing up ext4_alloc_group_tables() we should really change
>>> the logic in alloc_flex_gd() to make sure flex_gd->resize_bg never exceeds
>>> flexbg size. Baokun?
>>>
>>>                                  Honza
>> Hi Honza,
>>
>> Your analysis is absolutely correct. It's a bug!
>> Thank you for locating this issue！
>> An extra 1 should not be added when calculating resize_bg in alloc_flex_gd().
>>
>>
>> Hi Aleksandr,
>>
>> Could you help test if the following changes work?
>>
> I just got an internal bug report for this as well, and I can also confirm that
> the patch fixes my testcase, thanks! Feel free to add:
>
> Tested-by: Eric Sandeen <sandeen@...hat.com>
>
> I had been trying to debug this and it felt like an off by one but I didn't get
> to a solution in time. ;)
>
> Can you explain what the 2 cases under
>
> /* Avoid allocating large 'groups' array if not needed */
>
> are doing? I *think* the first 'if' is trying not to over-allocate for the last
> batch of block groups that get added during a resize. What is the "else if" case
> doing?
>
> Thanks,
> -Eric
>
The ext4 online resize started out by allocating the memory needed
for an entire flexible block group regardless of the number of block
groups to be added.

This led to a warning being triggered in the mm module when the
userland set a very large flexbg_size (perhaps they wanted to put
all the group metadata in the head of the disk), allocating more
memory than the MAX_ORDER size. Commit 5d1935ac02ca5a
("ext4: avoid online resizing failures due to oversized flex bg") fixes
this problem with resize_bg.

Normally, when online resizing, we may not need to allocate the
memory required for an entire flexible block group. So if o_group
and n_group are in the same flexible block group or if o_group
and n_group are in neighbouring flexible block groups, we
check if less memory can be allocated. This is what was done in
commit 665d3e0af4d3 ("ext4: reduce unnecessary memory
allocation in alloc_flex_gd()").

Regards,
Baokun