lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 25 Aug 2023 10:08:17 +0800
From:   Baokun Li <libaokun1@...wei.com>
To:     Jan Kara <jack@...e.cz>
CC:     <linux-ext4@...r.kernel.org>, <tytso@....edu>,
        <adilger.kernel@...ger.ca>, <darrick.wong@...cle.com>,
        <yi.zhang@...wei.com>, <yangerkun@...wei.com>,
        <yukuai3@...wei.com>, Baokun Li <libaokun1@...wei.com>
Subject: Re: [PATCH] e2fsck: delay quotas loading in release_orphan_inodes()

On 2023/8/25 3:37, Jan Kara wrote:
> On Thu 24-08-23 20:56:14, Baokun Li wrote:
>> On 2023/8/24 18:08, Jan Kara wrote:
>>> On Thu 24-08-23 10:27:46, Baokun Li wrote:
>>>> Hello, Jan!
>>>>
>>>> On 2023/8/24 1:05, Jan Kara wrote:
>>>>> On Thu 17-08-23 16:18:28, Baokun Li wrote:
>>>>>> After 7d79b40b ("e2fsck: adjust quota counters when clearing orphaned
>>>>>> inodes"), we load all the quotas before we process the orphaned inodes,
>>>>>> and when we load the quotas, we check the checsum of the bbitmap for each
>>>>>> group. If one of the bbitmap checksums is wrong, the following error will
>>>>>> be reported:
>>>>>>
>>>>>> “Error initializing quota context in support library:
>>>>>>     Block bitmap checksum does not match bitmap”
>>>>>>
>>>>>> But loading quotas comes before checking the current superblock for the
>>>>>> EXT2_ERROR_FS flag, which makes it impossible to use e2fsck to repair any
>>>>>> image that contains orphan inodes and has the wrong bbitmap checksum.
>>>>>> So delaying quota loading until after the EXT2_ERROR_FS judgment avoids
>>>>>> the above problem.
>>>>>>
>>>>>> Signed-off-by: Baokun Li <libaokun1@...wei.com>
>>>>> This certainly looks better but I wonder if there still isn't a problem if
>>>>> the bitmap checksums are wrong but EXT2_ERROR_FS is not set. Shouldn't we
>>>>> rather move the initialization of the quota files after the call to
>>>>> e2fsck_read_bitmaps()?
>>>>>
>>>>> 								Honza
>>>> When the bitmap checksums are wrong but EXT2_ERROR_FS is not set, we must
>>>> have lost some data (error flag or group descriptor or bitmap), so there
>>>> is something wrong with the kernel at this time, so I don't think we
>>>> should fix the image directly, but rather let the user realize that
>>>> something is wrong with the filesystem logic.
>>> I agree it means there is a problem somewhere (the storage, the kernel, or
>>> similar). But just ignoring bitmap checksums in release_orphan_inodes() is
>>> exactly how e2fsck behaves on filesystems without quota feature so I see no
>>> reason for quota feature to change that because the inconsistency has
>>> nothing to do with quotas...
>>>
>>>> Moreover, if we don't care how this happened, but just want to fix the
>>>> image, we only need to run "e2fsck -a" twice. After merging in the
>>>> current patch, we always empty the orphan list before loading the quotas,
>>>> and EXT2_ERROR_FS is set when loading the quotas fails, so this will be
>>>> fixed the second time you run e2fsck. It will not happen that every
>>>> e2fsck will fail like it did before.
>>> I see, you're right so it isn't as bad as I originally thought but still my
>>> argument above holds - IMO e2fsck should treat wrong bitmap checksums the
>>> same way with and without the quota feature.
>>>
>>> 								Honza
>> The original flow that went wrong here is as follows:
>> e2fsck
>>   e2fsck_run_ext3_journal
>>   check_super_block
>>    release_orphan_inodes
>>     e2fsck_read_all_quotas
>>      quota_read_all_dquots
>>       quota_file_open
>>        ext2fs_read_bitmaps
>>         ext2fs_rw_bitmaps
>>          read_bitmaps_range
>>           read_bitmaps_range_start
>>            ext2fs_block_bitmap_csum_verify
>>             !!! error
>>   e2fsck_run
>>
>> Yes, the inconsistency has nothing to do with quota, but quota is loaded
>> here to keep track of space changes during the normal processing of
>> orphan list. If quota was not loaded, we would not have read and check
>> bitmaps until Pass5, and we had already done a lot of checking and
>> tweaking of inodes, blocks, and dirs before Pass5, and the bitmaps
>> inconsistency may have been fixed during that time.
> This is not true. release_orphan_inodes() calls e2fsck_read_bitmaps() which
> loads all the bitmaps while ignoring checksum failures. This is needed so
> that blocks released during orphan cleanup are properly tracked as free.
> All I want to do is to move the call to e2fsck_read_all_quotas() a bit
> further than you moved it to a place after the e2fsck_read_bitmaps()
> call...
>
> 								Honza
Yes, e2fsck_read_bitmaps() ignores checksum errors for reading bitmaps, 
which
prevents us from exiting e2fsck due to checksum error in 
release_orphan_inodes(),
but in the case of the previously mentioned checksum error but EXT2_ERROR_FS
is not set, when we execute "e2fsck -a", since checksum is ignored, the 
filesystem
is considered clean, so it exits e2fsck without performing a force 
check, but the
error is still there.

                                            Baokun

-- 
With Best Regards,
Baokun Li
.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ