lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230824193739.4k6kjsohbggoiu4n@quack3>
Date:   Thu, 24 Aug 2023 21:37:39 +0200
From:   Jan Kara <jack@...e.cz>
To:     Baokun Li <libaokun1@...wei.com>
Cc:     Jan Kara <jack@...e.cz>, linux-ext4@...r.kernel.org, tytso@....edu,
        adilger.kernel@...ger.ca, darrick.wong@...cle.com,
        yi.zhang@...wei.com, yangerkun@...wei.com, yukuai3@...wei.com
Subject: Re: [PATCH] e2fsck: delay quotas loading in release_orphan_inodes()

On Thu 24-08-23 20:56:14, Baokun Li wrote:
> On 2023/8/24 18:08, Jan Kara wrote:
> > On Thu 24-08-23 10:27:46, Baokun Li wrote:
> > > Hello, Jan!
> > > 
> > > On 2023/8/24 1:05, Jan Kara wrote:
> > > > On Thu 17-08-23 16:18:28, Baokun Li wrote:
> > > > > After 7d79b40b ("e2fsck: adjust quota counters when clearing orphaned
> > > > > inodes"), we load all the quotas before we process the orphaned inodes,
> > > > > and when we load the quotas, we check the checsum of the bbitmap for each
> > > > > group. If one of the bbitmap checksums is wrong, the following error will
> > > > > be reported:
> > > > > 
> > > > > “Error initializing quota context in support library:
> > > > >    Block bitmap checksum does not match bitmap”
> > > > > 
> > > > > But loading quotas comes before checking the current superblock for the
> > > > > EXT2_ERROR_FS flag, which makes it impossible to use e2fsck to repair any
> > > > > image that contains orphan inodes and has the wrong bbitmap checksum.
> > > > > So delaying quota loading until after the EXT2_ERROR_FS judgment avoids
> > > > > the above problem.
> > > > > 
> > > > > Signed-off-by: Baokun Li <libaokun1@...wei.com>
> > > > This certainly looks better but I wonder if there still isn't a problem if
> > > > the bitmap checksums are wrong but EXT2_ERROR_FS is not set. Shouldn't we
> > > > rather move the initialization of the quota files after the call to
> > > > e2fsck_read_bitmaps()?
> > > > 
> > > > 								Honza
> > > When the bitmap checksums are wrong but EXT2_ERROR_FS is not set, we must
> > > have lost some data (error flag or group descriptor or bitmap), so there
> > > is something wrong with the kernel at this time, so I don't think we
> > > should fix the image directly, but rather let the user realize that
> > > something is wrong with the filesystem logic.
> > I agree it means there is a problem somewhere (the storage, the kernel, or
> > similar). But just ignoring bitmap checksums in release_orphan_inodes() is
> > exactly how e2fsck behaves on filesystems without quota feature so I see no
> > reason for quota feature to change that because the inconsistency has
> > nothing to do with quotas...
> > 
> > > Moreover, if we don't care how this happened, but just want to fix the
> > > image, we only need to run "e2fsck -a" twice. After merging in the
> > > current patch, we always empty the orphan list before loading the quotas,
> > > and EXT2_ERROR_FS is set when loading the quotas fails, so this will be
> > > fixed the second time you run e2fsck. It will not happen that every
> > > e2fsck will fail like it did before.
> > I see, you're right so it isn't as bad as I originally thought but still my
> > argument above holds - IMO e2fsck should treat wrong bitmap checksums the
> > same way with and without the quota feature.
> > 
> > 								Honza
> The original flow that went wrong here is as follows:
> e2fsck
>  e2fsck_run_ext3_journal
>  check_super_block
>   release_orphan_inodes
>    e2fsck_read_all_quotas
>     quota_read_all_dquots
>      quota_file_open
>       ext2fs_read_bitmaps
>        ext2fs_rw_bitmaps
>         read_bitmaps_range
>          read_bitmaps_range_start
>           ext2fs_block_bitmap_csum_verify
>            !!! error
>  e2fsck_run
> 
> Yes, the inconsistency has nothing to do with quota, but quota is loaded
> here to keep track of space changes during the normal processing of
> orphan list. If quota was not loaded, we would not have read and check
> bitmaps until Pass5, and we had already done a lot of checking and
> tweaking of inodes, blocks, and dirs before Pass5, and the bitmaps
> inconsistency may have been fixed during that time.

This is not true. release_orphan_inodes() calls e2fsck_read_bitmaps() which
loads all the bitmaps while ignoring checksum failures. This is needed so
that blocks released during orphan cleanup are properly tracked as free.
All I want to do is to move the call to e2fsck_read_all_quotas() a bit
further than you moved it to a place after the e2fsck_read_bitmaps()
call...

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ