Message-ID: <0e5dc5f1-c2c2-4893-902b-4677c21a38c0@roeck-us.net>
Date: Wed, 15 Jan 2025 08:41:43 -0800
From: Guenter Roeck <linux@...ck-us.net>
To: Jan Kara <jack@...e.cz>
Cc: Jim Zhao <jimzhao.ai@...il.com>, akpm@...ux-foundation.org,
 willy@...radead.org, linux-fsdevel@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] mm/page-writeback: Consolidate wb_thresh bumping logic
 into __wb_calc_thresh

On 1/15/25 08:07, Jan Kara wrote:
> On Tue 14-01-25 07:01:08, Guenter Roeck wrote:
>> On 1/14/25 05:19, Jan Kara wrote:
>>> On Mon 13-01-25 15:05:25, Guenter Roeck wrote:
>>>> On Thu, Nov 21, 2024 at 06:05:39PM +0800, Jim Zhao wrote:
>>>>> Address the feedback from "mm/page-writeback: raise wb_thresh to prevent
>>>>> write blocking with strictlimit"(39ac99852fca98ca44d52716d792dfaf24981f53).
>>>>> The wb_thresh bumping logic is scattered across wb_position_ratio,
>>>>> __wb_calc_thresh, and wb_update_dirty_ratelimit. For consistency,
>>>>> consolidate all wb_thresh bumping logic into __wb_calc_thresh.
>>>>>
>>>>> Reviewed-by: Jan Kara <jack@...e.cz>
>>>>> Signed-off-by: Jim Zhao <jimzhao.ai@...il.com>
>>>>
>>>> This patch triggers a boot failure with one of my 'sheb' boot tests.
>>>> It is seen when trying to boot from flash (mtd). The log says
>>>>
>>>> ...
>>>> Starting network: 8139cp 0000:00:02.0 eth0: link down
>>>> udhcpc: started, v1.33.0
>>>> EXT2-fs (mtdblock3): error: ext2_check_folio: bad entry in directory #363: : directory entry across blocks - offset=0, inode=27393, rec_len=3072, name_len=2
>>>> udhcpc: sending discover
>>>> udhcpc: sending discover
>>>> udhcpc: sending discover
>>>> EXT2-fs (mtdblock3): error: ext2_check_folio: bad entry in directory #363: : directory entry across blocks - offset=0, inode=27393, rec_len=3072, name_len=2
>>>
>>> Thanks for the report! Uh, I have to say I'm very confused by this. It is clear
>>> that when ext2 detects the directory corruption (we fail checking directory
>>> inode 363 which is likely /etc/init.d/), the boot fails in interesting
>>> ways. What is unclear is how the commit can possibly cause ext2 directory
>>> corruption. If you hadn't verified that reverting the commit fixes the issue,
>>> I'd be suspecting bad bisection but that obviously isn't the case :-)
>>>
>>> Ext2 is storing directory data in the page cache so at least it uses the
>>> subsystem which the patch impacts but how writeback throttling can cause
>>> ext2 directory corruption is beyond me. BTW, do you recreate the root
>>> filesystem before each boot? How exactly?
>>
>> I use pre-built root file systems. For sheb, they are at
>> https://github.com/groeck/linux-build-test/tree/master/rootfs/sheb
> 
> Thanks. So the problematic directory is /usr/share/udhcpc/ where we
> read apparently bogus metadata at the beginning of that directory.
> 
>> I don't think this is related to ext2 itself. Booting an ext2 image from
>> ata/ide drive works.
> 
> Interesting that this is specific to mtd. I'll read the patch carefully again
> to see if something rings a bell.
> 

Interesting. Is there some endianness issue, by any chance? I only see the problem
with sheb (big endian), not with sh (little endian). I'd suspect an emulation
bug, but it is odd that the problem did not show up before.

Guenter
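
The endianness question can be checked against the numbers in the log. The sketch below assumes that the first entry of directory #363 is a normal "." entry (inode=363, rec_len=12, name_len=1, file_type=2, laid out little-endian as ext2 does on disk); that entry is hypothetical, as are the helper names swap16 and parse_dirent_header. Swapping the two bytes inside every 16-bit word of that entry yields exactly the values ext2 complained about. This is only arithmetic on the logged values and does not show where such a swap would happen.

```python
import struct


def swap16(data: bytes) -> bytes:
    """Swap the two bytes inside every 16-bit word of data."""
    if len(data) % 2:
        raise ValueError("length must be even")
    out = bytearray(len(data))
    out[0::2] = data[1::2]  # odd bytes move to even positions
    out[1::2] = data[0::2]  # even bytes move to odd positions
    return bytes(out)


def parse_dirent_header(raw: bytes):
    """Decode the fixed 8-byte header of an ext2 directory entry.

    On-disk layout is little-endian:
    32-bit inode, 16-bit rec_len, 8-bit name_len, 8-bit file_type.
    """
    return struct.unpack_from("<IHBB", raw)


# Assumed healthy "." entry for directory inode 363 (not taken from the image).
good = struct.pack("<IHBB", 363, 12, 1, 2) + b".\0\0\0"

# The same bytes with every 16-bit word byte-swapped.
bad = swap16(good)

inode, rec_len, name_len, file_type = parse_dirent_header(bad)
print(f"inode={inode}, rec_len={rec_len}, name_len={name_len}")
```

The decoded values are inode=27393, rec_len=3072, name_len=2, which match the ext2_check_folio message. A rec_len of 3072 at offset 0 would also explain "directory entry across blocks" if the filesystem block size is smaller than that, which is an assumption here.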