[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4DDCEF4D.1070107@canonical.com>
Date: Wed, 25 May 2011 15:00:13 +0300
From: Surbhi Palande <surbhi.palande@...onical.com>
To: Ted Ts'o <tytso@....edu>
CC: sandeen@...hat.com, jack@...e.cz, marco.stornelli@...il.com,
adilger.kernel@...ger.ca, toshi.okajima@...fujitsu.com,
m.mizuma@...fujitsu.com, linux-ext4@...r.kernel.org,
linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH] Attempt to sync the fsstress writes to a frozen F.S
Hi Ted,
On 05/25/2011 12:42 AM, Ted Ts'o wrote:
> On Wed, May 11, 2011 at 10:10:41AM +0300, Surbhi Palande wrote:
>> While the fsstress background writes are busy dirtying the page cache, if a
>> fsfreeze happens then the background writes should stall. A sync should then
>> not have any data to sync to the FS. If it does have any data to sync then
>> sync will cause a deadlock by holding the s_umount write semaphore and waiting
>> in the wait queue for the FS to thaw, whereas the F.S can never thaw without
>> getting the s_umount write semaphore.
>>
>> Signed-off-by: Surbhi Palande<surbhi.palande@...onical.com>
>
> Hi Surbhi,
>
> Have you tried out Jan Kara's patches?
>
> [1/3] fs: Create __block_page_mkwrite() helper passing error values back
> [2/3] vfs: Block mmapped writes while the fs is frozen
> [3/3] ext4: Rewrite ext4_page_mkwrite() to return locked page
Yes! We have tried these patches and we still see the same
deadlock/hang. The following is the reason for it:
// lets assume the inode is clean and so are its pages.
P1: process that tries mmap write
t1) __do_fault()
t2) ext4_page_mkwrite()
t3) block_page_mkwrite()
t4) vfs_check_frozen()
// filesystem is not frozen so control falls through.
t5) __block_page_mkwrite()
t6) set_page_dirty()
t7) __set_page_dirty()
t8) radix_tree_tag_set(PAGECACHE_TAG_DIRTY)
// page is dirtied, but inode is yet clean.
---------------------- Pre-empted-----------------
P2: freeze process
t9) freeze_super()
t10) sync_filesystem()
// page cache now clean! no inode is dirty.
// however we have a dirty page belonging to a clean inode.
----------------------Freeze process finishes, filesystem frozen!----
P1: process that tries mmap write gets control.
t11) __set_page_dirty() // gets control back
t12) __mark_inode_dirty()v
// inode is now dirty and it has a dirty page.
// though in reality there is no write which has occured.
t13) if (inode->i_sb->s_frozen != SB_UNFROZEN)
// __block_page_mkwrite() gets control back
t14) unlock_page()
t15) __block_page_mkwrite() returns -EAGAIN
t16) block_page_mkwrite() returns VM_FAULT_RETRY
---------------------------
// now we see the original deadlock reported.
P3: sync a filesystem
t17) down_read(s_umount)
t18) sync_filesystem()
t19) sb->s_op->sync_fs() // =ext4_sync_fs()
t20) vfs_check_frozen() // now blocks for thaw.
// so thaw cannot happen because sync process sleeps with s_umount!
This deadlock can occur whenever the freeze happens after the
vfs_check_frozen() but before the __mark_inode_dirty().
We see blocked sync processes every time we do the following:
1) executing iozone on multipath and
2) I modified the script that Toshiyuki sent, attaching it here. This
script reproduces the bug faster when executed with iozone.
(Note, that since this is a race, this script _may not_ always produce
it on its own)
I also found one more missing piece in the "Add support to freeze and
unfreeze journal":
1) Call jdb2_journal_thaw() from ext4_unfreeze() to restart the
transactions.
I shall send a patch for the same as a reply to this email again.
Thanks!
Warm Regards,
Surbhi.
P3: sync
>
> Do these patches fix the problem you've been trying to fix with your
> patches? I believe they should, but I would appreciate confirmation
> that with these patches, you're no longer able to reproduce the
> problem you've been concerned about.
>
> Thanks, regards,
>
> - Ted
Download attachment "test.sh" of type "application/x-sh" (2747 bytes)
Powered by blists - more mailing lists