[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <16a6ed55-e8c9-cc56-b675-e9a9de57ab66@digidescorp.com>
Date: Sat, 30 Mar 2019 14:49:46 -0500
From: Steve Magnani <steve.magnani@...idescorp.com>
To: Jan Kara <jack@...e.cz>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
linux-fsdevel@...r.kernel.org
Subject: Re: Possible UDF locking error?
Jan -
On 3/25/19 11:42 AM, Jan Kara wrote:
> Hi!
>
> On Sat 23-03-19 15:14:05, Steve Magnani wrote:
>> I have been hunting a UDF bug that occasionally results in generation
>> of an Allocation Extent Descriptor with an incorrect tagLocation. So
>> far I haven't been able to see a path through the code that could
>> cause that. But, I noticed some inconsistency in locking during
>> AED generation and wonder if it could result in random corruption.
>>
>> The function udf_update_inode() has this general pattern:
>>
>> bh = udf_tgetblk(...); // calls sb_getblk()
>> lock_buffer(bh);
>> memset(bh->b_data, 0, inode->i_sb->s_blocksize);
>> // <snip>other code to populate FE/EFE data in the block</snip>
>> set_buffer_uptodate(bh);
>> unlock_buffer(bh);
>> mark_buffer_dirty(bh);
>>
>> This I can understand - the lock is held for as long as the buffer
>> contents are being assembled.
>>
>> In contrast, udf_setup_indirect_aext(), which constructs an AED,
>> has this sequence:
>>
>> bh = udf_tgetblk(...); // calls sb_getblk()
>> lock_buffer(bh);
>> memset(bh->b_data, 0, inode->i_sb->s_blocksize);
>>
>> set_buffer_uptodate(bh);
>> unlock_buffer(bh);
>> mark_buffer_dirty_inode(bh);
>>
>> // <snip>other code to populate AED data in the block</snip>
>>
>> In this case the population of the block occurs without
>> the protection of the lock.
>>
>> Because the block has been marked dirty, does this mean that
>> writeback could occur at any point during population?
> Yes. Thanks for noticing this!
>
>> There is one path through udf_setup_indirect_aext() where
>> mark_buffer_dirty_inode() gets called again after population is
>> complete, which I suppose could heal a partial writeout, but there is
>> also another path in which the buffer does not get marked dirty again.
> Generally, we add new extents to the created indirect extent which dirties
> the buffer and that should fix the problem. But you are definitely right
> that the code is suspicious and should be fixed. Will you send a patch?
I did a little archaeology to see how the code evolved to this point.
It's been like this a long time.
I also did some research to understand why filesystems use lock_buffer()
sometimes but not others. For example, the FAT driver never calls it. I
ran across this thread from 2011:
https://lkml.org/lkml/2011/5/16/402
...from which I conclude that while it is correct in a strict sense to
hold a lock on a buffer any time its contents are being modified,
performance considerations make it preferable (or at least reasonable)
to make some modifications without a lock provided it's known that a
subsequent write-out will "fix" any potential partial write out before
anyone else tries to read the block. I doubt that UDF sees common use
with DIF/DIX block devices, which might make a decision in favor of
performance a little easier. Since the FAT driver doesn't contain
Darrick's proposed changes I assume a decision was made that performance
was more important there.
Certainly the call to udf_setup_indirect_aext() from udf_add_aext()
meets those criteria. But udf_table_free_blocks() may not dirty the AED
block.
So if this looks reasonable I will resend as a formal patch:
--- a/fs/udf/inode.c 2019-03-30 11:28:38.637759458 -0500
+++ b/fs/udf/inode.c 2019-03-30 11:33:00.357761250 -0500
@@ -1873,9 +1873,6 @@ int udf_setup_indirect_aext(struct inode
return -EIO;
lock_buffer(bh);
memset(bh->b_data, 0x00, sb->s_blocksize);
- set_buffer_uptodate(bh);
- unlock_buffer(bh);
- mark_buffer_dirty_inode(bh, inode);
aed = (struct allocExtDesc *)(bh->b_data);
if (!UDF_QUERY_FLAG(sb, UDF_FLAG_STRICT)) {
@@ -1890,6 +1887,9 @@ int udf_setup_indirect_aext(struct inode
udf_new_tag(bh->b_data, TAG_IDENT_AED, ver, 1, block,
sizeof(struct tag));
+ set_buffer_uptodate(bh);
+ unlock_buffer(bh);
+
nepos.block = neloc;
nepos.offset = sizeof(struct allocExtDesc);
nepos.bh = bh;
@@ -1913,6 +1913,8 @@ int udf_setup_indirect_aext(struct inode
} else {
__udf_add_aext(inode, epos, &nepos.block,
sb->s_blocksize | EXT_NEXT_EXTENT_ALLOCDECS, 0);
+ /* Make sure completed AED gets written out */
+ mark_buffer_dirty_inode(nepos.bh, inode);
}
brelse(epos->bh);
------------------------------------------------------------------------
Steven J. Magnani "I claim this network for MARS!
www.digidescorp.com Earthling, return my space modulator!"
#include <standard.disclaimer>
Powered by blists - more mailing lists