[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170615080009.GE31671@ZenIV.linux.org.uk>
Date: Thu, 15 Jun 2017 09:00:09 +0100
From: Al Viro <viro@...IV.linux.org.uk>
To: Richard Narron <comet.berkeley@...il.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [git pull] first batch of ufs fixes
On Wed, Jun 14, 2017 at 08:11:33AM +0100, Al Viro wrote:
> NOTE: all I have is your image *after* it had counters buggered; I don't
> know the exact sequence of operations that fucked it in your case. One
> way to trigger it is to mount/umount on OpenBSD, then mount/modify/umount
> on Linux, then mount/umount on OpenBSD, then fsck on OpenBSD. This patch
> apparently fixes that, but your reproducer might be something different.
FWIW, it seems to work here. Said that, *BSD fsck_ffs is not worth much -
play a bit with redundancy in UFS superblock (starting with fragment
and block sizes, their ratio, logarithms, bitmasks, etc.) and you can
screw at least 10.3 into the ground when mounting an image that passes
their fsck. Sure, anyone who mounts untrusted images is a cretin who
deserves everything they get, fsck or no fsck, but... no complaints from
fsck is not a reliable indicator of image being in good condition and
that's PITA for testing.
Another pile of fun: "reserve ->s_minfree percents of total" logics had
been broken.
* using hardwired 5% is wrong - especially for ufs2, where it's
not even the default
* ufs_freespace() returns u64; testing for <= 0 is not doing the
right thing
* no capability checks before we need them, TYVM...
* ufs2 needs 64bit uspi->s_dsize (and ->s_size, while we are at it).
64bit variants were even calculated - and never used.
* while we are at it, doing "multiply the total data frags by
s_minfree and divide by 100" every time we allocate a block is bloody
dumb - that should be calculated once.
We really need to get the sodding tail unpacking moved up from the place
where it's buried - turns out that my doubts about that code managing to
avoid deadlocks had been correct. Long-term we need to move that thing
to iomap-based ->write_iter() and do unpacking there and in truncate().
For now I've slapped together something that is easier to backport -
avoiding ->truncate_mutex when possible and not holding ->s_lock over
ufs_change_blocknr().
Another bug in the same area: ufs_get_locked_page() doesn't guarantee
that buffer_heads are attached (race with vmscan trying to evict the
page in question can end with buffer_heads freed and page left alive
and uptodate). Callers do expect buffer_heads to be there, so we either
need to do create_empty_buffers() in those callers or in ufs_get_locked_page();
I went for the latter for now.
Off-by-one in ufs_truncate_blocks(): the logics when deciding whether
we need to do anything with direct blocks is broken when new size is
within the last direct block. It's better to find the path to the
last byte _not_ to be removed and use that instead of the path to the
beginning of the first block to be freed.
I've pushed fixes for those into vfs.git#ufs-fixes; they do need more
testing before I send a pull request, though.
Powered by blists - more mailing lists