Message-ID: <20210825113016.GB14620@quack2.suse.cz>
Date: Wed, 25 Aug 2021 13:30:16 +0200
From: Jan Kara <jack@...e.cz>
To: Theodore Ts'o <tytso@....edu>
Cc: Jan Kara <jack@...e.cz>, linux-ext4@...r.kernel.org
Subject: Re: [PATCH 0/5 v7] ext4: Speedup orphan file handling
On Tue 24-08-21 13:14:09, Theodore Ts'o wrote:
> I've been running some tests exercising the orphan_file code, and
> there are a number of failures:
>
> ext4/orphan_file: 512 tests, 3 failures, 25 skipped, 7325 seconds
> Failures: ext4/044 generic/475 generic/643
> ext4/orphan_file_1k: 524 tests, 6 failures, 37 skipped, 8361 seconds
> Failures: ext4/033 ext4/044 ext4/045 generic/273 generic/476 generic/643
>
> generic/643 is the iomap swap failure, and can be ignored.
> generic/475 is a pre-existing test flake that involves simulated disk
> failures, which we can also ignore in the context of orphan_file.
>
> However, ext4/044 is one that looks... interesting:
>
> root@...-xfstests:~# e2fsck -fn /dev/vdc
> e2fsck 1.46.4-orphan-file (22-Aug-2021)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> Orphan file (inode 12) block 0 is not clean.
> Clear? no
>
> Failed to initialize orphan file.
> Recreate? no
>
> This is highly reproducible, and involves using a file system config
> that is probably a little unusual:
>
> Filesystem features: has_journal ext_attr resize_inode dir_index orphan_file filetype sparse_super large_file
>
> (This was created using "mke2fs -t ext3 -O orphan_file".)
Interesting. I don't see how the orphan handling code gets used at all for
this test. Hrm. Actually it seems to be a bug in the tools themselves, because
just "mke2fs -t ext3 -O orphan_file" followed by "e2fsck -f" reproduces
exactly this failure. It seems that when I was adding the physical block
number to the orphan file block checksum, I broke e2fsck for the case where
metadata_csum is disabled. I've fixed the bug now (relative diff attached;
I can resend the full series once the other bugs are dealt with as well).
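For anyone who wants to reproduce this by hand without a spare block device, a minimal sketch on a scratch image file could look like the following. It assumes an e2fsprogs build that understands the orphan_file feature; the function name, the MKE2FS/E2FSCK variables, the image path and the 64M size are illustrative assumptions, not something taken from the test setup.

```shell
#!/bin/sh
# Sketch only: rerun the two commands from the report on a scratch image.
# MKE2FS and E2FSCK are hypothetical overrides for picking a specific build.
MKE2FS=${MKE2FS:-mke2fs}
E2FSCK=${E2FSCK:-e2fsck}

repro_orphan_fsck() {
    img=$1
    # -F: accept a regular file as the target, -q: quiet,
    # 64M: size of the file system to create (assumed, any small size works)
    "$MKE2FS" -q -F -t ext3 -O orphan_file "$img" 64M || return 2
    # -f: force a full check, -n: open read-only and answer "no" to prompts
    "$E2FSCK" -fn "$img"
}

# Usage: repro_orphan_fsck /tmp/orphan-test.img; echo "e2fsck exit: $?"
```

With an affected e2fsck the second command should print the "Orphan file (inode 12) block 0 is not clean" complaint quoted above; a non-zero exit status from the function means either mke2fs failed (2) or e2fsck reported something.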
> The orphan_file_1k failures seem to involve running out of space in
> the orphan_file, and the fallback to using the old fashioned orphan
> list seems to return ENOSPC? For example, from ext4/045:
>
> +mkdir: No space left on device
> +Failed to create directories - 19679
>
> ext4/045 creates a lot of directories by calling mkdir (it tests
> creating more than 65000 subdirectories in a directory), and so this
> seems to be triggering a failure?
Strange. I don't see how the ext4/045 load could run out of space in the
orphan file (and in fact I did test that the fallback to the old orphan list
works correctly when the orphan file runs out of space). Anyway, I'll look
into it. Thanks for the reports!
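To narrow down the ENOSPC outside of xfstests, a rough sketch of the same kind of load is below: it creates subdirectories one by one and reports how many succeeded before the first mkdir failure. The helper name make_subdirs and the d<N> naming are hypothetical; this is not the code ext4/045 actually runs.

```shell
# Sketch only: create up to $2 subdirectories under $1 and stop at the
# first mkdir failure, printing how many were created before it.
make_subdirs() {
    parent=$1
    want=$2
    i=0
    while [ "$i" -lt "$want" ]; do
        # mkdir's own error message (e.g. ENOSPC) is left on stderr
        if ! mkdir "$parent/d$i"; then
            echo "mkdir failed after $i directories"
            return 1
        fi
        i=$((i + 1))
    done
    echo "created $i directories"
}

# Usage (mount point is an assumption): make_subdirs /mnt/test/dir 65537
```

Running it on a file system made with the 1k block size configuration should show whether the failure count is stable across runs.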
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
View attachment "e2fsck-fixup.patch" of type "text/x-patch" (1206 bytes)