[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMBkX3fMBUiQj1YRU90YzyoY0setFeRmJDff5+cu75BpfCM_ug@mail.gmail.com>
Date: Tue, 10 Jan 2012 22:06:57 -0800
From: Surbhi Palande <csurbhi@...il.com>
To: Jan Kara <jack@...e.cz>
Cc: Eric Sandeen <sandeen@...deen.net>,
Kamal Mostafa <kamal@...onical.com>,
Andreas Dilger <adilger.kernel@...ger.ca>,
Randy Dunlap <rdunlap@...otime.net>,
Theodore Tso <tytso@....edu>, linux-ext4@...r.kernel.org,
Valerie Aurora <val@...consulting.com>,
Christopher Chaltain <christopher.chaltain@...onical.com>,
"Peter M. Petrakis" <peter.petrakis@...onical.com>,
Mikulas Patocka <mpatocka@...hat.com>
Subject: Re: [PATCH v2 1/7] Adding support to freeze and unfreeze a journal
(Resending one of the last cases, as the two task scenario got mixed! )
> Now following is the case where the writing
> process/jbd2_journal_start() calls read_lock() *before*
> jbd2_journal_lock_updates() are called by the freezing process.
>
CPU1 CPU2
Task1 (write operation) Task2
tx read_lock(&journal->j_state_lock)
tx+1 jbd2_journal_lock_updates() (will be stuck
at the next instance...)
tx+3 if(journal->j_barrier_count)
write_lock(journal->j_state_lock) /* stuck here till Task1
relinquishes the lock*/
tx+4 handle started and associated with
running transaction.
tx+5 read_unlock(&journal->j_state_lock)
tx+6 jbd2_journal_lock_updates() succeeds
tx+7 jbd2_journal_flush() /* This shall flush the
running transactions */
/* There are no outstanding transcations. No
new ext4_journal_start() will succeed by this point */
> --------------------------------
> Thus:
>
> Though a racy transaction can really be started after SB_FREEZE_WRITE,
> either that transaction will be flushed by jbd2_journal_flush or it
> cannot be started at all because of the barrier count.
>
> So, I do not understand why the journaled writes should really have a
> race window after this patch is applied?
>
Regards,
Surbhi
On Tue, Jan 10, 2012 at 9:38 PM, Surbhi Palande <csurbhi@...il.com> wrote:
> On second thoughts, I fail to see why there is still a race window
> after this patch.
>
> Here are the reasons why i fail to see how the data can be dirtied
> when all the operations involve a journal:
>
> ----------
> So here is the problem that we see
> CPU1 CPU2
> Task1 (write operation) Task2
> ---------------------------------------------------------------------------------------
> t1 ext4_journal_start()
> t2 ext4_journal_start_sb()
> t3 vfs_check_frozen sb->frozen=SB_FREEZE_WRITE
> t4 jbd2_journal_start() /* hence forth all processes calling
> vfs_check_frozen will wait */
>
> Now, our aim is to stop Task1 from dirtying the page cache ie in
> starting this transaction. However if it is successful in starting
> this transaction, then we want to make sure that this transaction is
> flushed out.
> Correct?
>
> (with the journal freeze patch applied)
> * Task1 is now executing the code of jbd2_journal_start(). Task2 is
> the freezing process.
>
> CPU1 CPU2
> Task1 (write operation) Task2
> t4 jbd2_journal_start()
> ...
> tx read_lock(&journal->j_state_lock)
>
> Now two things can happen at this point with respect to Task2:
> a) either journal->j_flags is set to JBD2_FROZEN already
> b) or it is not set.
>
>
> Lets look at both the cases:
> A) When journal->j_flags is set to JBD2_FROZEN already then
> jbd2_journal_start() will get stuck on the waitqueue as long as the
> journal->j_flags is JBD2_FROZEN. So we cannot create dirty data in
> this case.
>
> B) When journal->j_flags is not set to JBD2_ FROZEN then two more
> things could happen:
> Task2 has already finished the call to jbd2_journal_flush() by the
> time Task1 calls read_lock(). So now no new task (T1) should create
> any new dirty data as that will not get flushed out. So we really want
> to stop the jbd2_journal_start() from succeeding.
>
>
> CPU1 CPU2
> Task1 (write operation) Task2
> ----------------------------------------------------------------------------------------------
> tx read_lock(&journal->j_state_lock)
> jbd2_journal_lock_updates() /*journal->j_barrier_count incremented -
> so non zero ! */
> tx+1 jbd2_journal_flush()
> tx+2 write_lock(&journal->j_state_lock);
> /* till the read_lock is relinquished by
> task1 , we are stuck */
> tx+3 if(journal->j_barrier_count)
> read_unlock(&journal->j_state_lock);
>
> /* so we can set the journal->j_flags now as write_lock()
> succeeds here */
> tx+4 goto repeat
>
> The above case is where the writing process/jbd2_journal_start() calls
> read_lock() *after* jbd2_journal_lock_updates() are called by the
> freezing process and hence is protected by the j_barrier_count check.
> This transaction can definitely not start!
>
> Now following is the case where the writing
> process/jbd2_journal_start() calls read_lock() *before*
> jbd2_journal_lock_updates() are called by the freezing process.
>
> However, if Task1 already finished starting the transaction as follows:
> CPU1 CPU2
> Task1 (write operation)
> Task2
> -------------------------------------------------------------------------------------------------------------------------------------------------------------
> tx read_lock(&journal->j_state_lock)
> tx+1
> jbd2_journal_lock_updates() (will be stuck at the next instance...)
> tx+3 if(journal->j_barrier_count)
> write_lock(journal->j_state_lock)
> /* journal->j_barrier_count = 0, so we proceed here*/
> /* stuck here till Task1 relinquishes the lock*/
> tx+4 read_unlock(&journal->j_state_lock);
> tx+5 now start_this_handle() returns successfully
> and associates this handle with the running
> transaction.
> tx+6
> jbd2_journal_lock_updates() succeeds
> tx+7
> jbd2_journal_flush() /* This shall flush the running transactions */
> /* There are no
> outstanding transcations. No new ext4_journal_start() will succeed by
> this point*/
>
> --------------------------------
> Thus:
>
> Though a racy transaction can really be started after SB_FREEZE_WRITE,
> either that transaction will be flushed by jbd2_journal_flush or it
> cannot be started at all because of the barrier count.
>
> So, I do not understand why the journaled writes should really have a
> race window after this patch is applied?
>
> Thanks!
>
> Regards,
> Surbhi
>
>
>
>
>
>
>
> On Tue, Jan 10, 2012 at 4:13 PM, Surbhi Palande <csurbhi@...il.com> wrote:
>> Hi Jan,
>>
>>>>
>>>>
>>>> If all the write operations were journaled, then this patch would not allow
>>>> ext4 filesystem to have any dirty data after its frozen.
>>>> (as journal_start() would block).
>>>>
>>>> I think the only one candidate that creates dirty data without calling
>>>> ext4_journal_start() is mmapped?
>>> No, the problem is in any write path. The problem is with operations
>>> that happen during the phase when s_frozen == SB_FREEZE_WRITE. These
>>> operations dirty the filesystem but running sync may easily miss them.
>>> During this phase journal is not frozen so that does not help you in any
>>> way.
>>>
>>> Honza
>>
>> Ok! No new transaction can really start after the journal is frozen.
>> But we can have dirty data after SB_FREEZE_WRITE and before
>> SB_FREEZE_TRANS.
>> I agree with you. However, can this be fixed by adding a
>> sync_filesystem() in freeze_super() after the sb->s_op->freeze_fs() is
>> over?
>>
>>
>> So then essentially, when freeze_super() returns, the page cache is clean?
>>
>> I do definitely agree that the fix is to add a lock for mutual
>> exclusion between freeze filesystem and writes to a frozen filesystem.
>>
>> Thanks!
>>
>> Regards,
>> Surbhi.
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists