Message-ID: <20191104130823.GC28764@mit.edu>
Date: Mon, 4 Nov 2019 08:08:23 -0500
From: "Theodore Y. Ts'o" <tytso@....edu>
To: Jan Kara <jack@...e.cz>
Cc: linux-ext4@...r.kernel.org
Subject: Re: [PATCH 21/22] ext4: Reserve revoke credits for freed blocks
On Wed, Oct 23, 2019 at 06:13:14PM +0200, Jan Kara wrote:
> > It would probably be better to push this up to the callers, since we
> > can get the exact number by calculating
> >
> > (EXT4_B2C(sbi, last) - EXT4_B2C(sbi, first) + 1) * sbi->s_cluster_ratio
> >
> > This is a bit more complicated in fs/ext4/indirect.c, where we
> > probably will need to do a min of these two formulas.
>
> Is it worth the complexity at the callers? If we don't use some reserved
> revoke credits, we'll just return them back. And the truncate code
> generally works one extent at a time so in the end we may have just asked
> for 1 more descriptor block than strictly necessary while the handle is
> running...
Sure, this is a change we can make later if we think it's necessary.
Bigalloc file systems aren't that common, and when they are used, most
of the time people aren't creating large numbers of small files and/or
directories.
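
For reference, the formula quoted above can be sketched in user-space C
roughly as follows. This is only an illustration, not the kernel code:
struct sbi_sketch, b2c() and revoke_credits_for_range() are hypothetical
stand-ins, and b2c() assumes that EXT4_B2C() amounts to a right shift by
the cluster bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the two struct ext4_sb_info fields the
 * formula needs; not the kernel structure.
 */
struct sbi_sketch {
	unsigned int s_cluster_bits;	/* log2(blocks per cluster) */
	unsigned int s_cluster_ratio;	/* blocks per cluster */
};

/* Block number to cluster number (assumed to mirror EXT4_B2C()). */
static uint64_t b2c(const struct sbi_sketch *sbi, uint64_t blk)
{
	return blk >> sbi->s_cluster_bits;
}

/*
 * (EXT4_B2C(sbi, last) - EXT4_B2C(sbi, first) + 1) * sbi->s_cluster_ratio
 * for the inclusive block range [first, last].
 */
static uint64_t revoke_credits_for_range(const struct sbi_sketch *sbi,
					  uint64_t first, uint64_t last)
{
	assert(first <= last);
	return (b2c(sbi, last) - b2c(sbi, first) + 1) * sbi->s_cluster_ratio;
}
```

With 16 blocks per cluster, blocks 15..16 straddle two clusters and give
32; with a cluster ratio of 1 (no bigalloc) the result is simply the
number of blocks in the range.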
> Yes, I was thinking about the same. Extent format of revoke blocks would
> certainly reduce the number of revoke descriptor blocks in the average
> case. On the other hand I think that especially large directories can be
> pretty fragmented so it isn't clear how big the average win would be. And
> as you say the worst case estimate would not really change substantially
> with the different format so to make the filesystem resistant to a malicious
> attacker we need some form of reservation of revoke descriptor blocks
> anyway. So in the end I've decided to go without on-disk format change for
> now.
Adding a new on-disk journal format is easier than making other ext4
format changes, since the journal is transient, and the case where the
user is simultaneously (a) rolling back to an older kernel which might
not support the new journal feature, and (b) crashes so that journal
replay is necessary, and (c) it's the root file system, so e2fsck
can't take care of the journal replay is a pretty rare / edge case.
That being said, we can set that aside as a possible later
enhancement. I suspect the main place we would have the large
contiguous range of blocks to be revoked is the data=journal case, and
one of the things I keep wondering about is how much it is worth it to
keep that code. So long as it's not posing a code maintenance burden,
I don't mind that much; but I also wonder how many people are actually
using it in practice.
Out of curiosity, how easily were you able to trigger the revoke
overflow situation using normal directories? I would have expected it
would have been fairly difficult to do, except for large file
deletions with data=journal?
- Ted