[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LNX.2.21.1706160655260.1877@joy.test>
Date: Fri, 16 Jun 2017 07:29:00 -0700 (PDT)
From: Richard Narron <comet.berkeley@...il.com>
To: Al Viro <viro@...IV.linux.org.uk>
cc: Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [git pull] first batch of ufs fixes
On Thu, 15 Jun 2017, Al Viro wrote:
> On Wed, Jun 14, 2017 at 08:11:33AM +0100, Al Viro wrote:
> FWIW, it seems to work here. Said that, *BSD fsck_ffs is not worth much -
> play a bit with redundancy in UFS superblock (starting with fragment
> and block sizes, their ratio, logarithms, bitmasks, etc.) and you can
> screw at least 10.3 into the ground when mounting an image that passes
> their fsck. Sure, anyone who mounts untrusted images is a cretin who
> deserves everything they get, fsck or no fsck, but... no complaints from
> fsck is not a reliable indicator of image being in good condition and
> that's PITA for testing.
>
> Another pile of fun: "reserve ->s_minfree percents of total" logics had
> been broken.
> * using hardwired 5% is wrong - especially for ufs2, where it's
> not even the default
> * ufs_freespace() returns u64; testing for <= 0 is not doing the
> right thing
> * no capability checks before we need them, TYVM...
> * ufs2 needs 64bit uspi->s_dsize (and ->s_size, while we are at it).
> 64bit variants were even calculated - and never used.
> * while we are at it, doing "multiply the total data frags by
> s_minfree and divide by 100" every time we allocate a block is bloody
> dumb - that should be calculated once.
>
> We really need to get the sodding tail unpacking moved up from the place
> where it's buried - turns out that my doubts about that code managing to
> avoid deadlocks had been correct. Long-term we need to move that thing
> to iomap-based ->write_iter() and do unpacking there and in truncate().
> For now I've slapped together something that is easier to backport -
> avoiding ->truncate_mutex when possible and not holding ->s_lock over
> ufs_change_blocknr().
>
> Another bug in the same area: ufs_get_locked_page() doesn't guarantee
> that buffer_heads are attached (race with vmscan trying to evict the
> page in question can end with buffer_heads freed and page left alive
> and uptodate). Callers do expect buffer_heads to be there, so we either
> need to do create_empty_buffers() in those callers or in ufs_get_locked_page();
> I went for the latter for now.
>
> Off-by-one in ufs_truncate_blocks(): the logics when deciding whether
> we need to do anything with direct blocks is broken when new size is
> within the last direct block. It's better to find the path to the
> last byte _not_ to be removed and use that instead of the path to the
> beginning of the first block to be freed.
>
> I've pushed fixes for those into vfs.git#ufs-fixes; they do need more
> testing before I send a pull request, though.
The 8 patches in the ufs-fixes group were applied to Linux 4.12-rc5.
They seem to work fine with the simple testing that I do.
I tested all 3 BSDs, FreeBSD 11.0, OpenBSD 6.1 and NetBSD 7.1 using 2
filesystems, 44bsd (ufs1) and ufs2.
I found no errors doing a Linux mkdir, copy large file, BSD fsck, Linux rm
large file, rmdir and BSD fsck in any of the 6 combinations.
Doing a "df" on BSD and Linux now match on the counts including the
"Available" counts.
It might be worth testing with ufs filesystems using softdep and/or
journaling. Should the Linux mount command reject such filesystems?
Now that ufs write access is working more or less, we're dangerous.
Powered by blists - more mailing lists