Message-Id: <6.0.0.20.2.20080806153517.04146a50@172.19.0.2>
Date: Wed, 06 Aug 2008 15:55:47 +0900
From: Hisashi Hifumi <hifumi.hisashi@....ntt.co.jp>
To: Mingming Cao <cmm@...ibm.com>, Chris Mason <chris.mason@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, jack@...e.cz,
linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH] jbd jbd2: fix dio write returning EIO when try_to_release_page fails
>> > >> > diff -Nrup linux-2.6.27-rc1.org/fs/jbd/transaction.c linux-2.6.27-rc1/fs/jbd/transaction.c
>> > >> > --- linux-2.6.27-rc1.org/fs/jbd/transaction.c 2008-07-29 19:28:47.000000000 +0900
>> > >> > +++ linux-2.6.27-rc1/fs/jbd/transaction.c 2008-07-29 20:40:12.000000000 +0900
>> > >> > @@ -1764,6 +1764,12 @@ int journal_try_to_free_buffers(journal_
>> > >> > */
>> > >> > if (ret == 0 && (gfp_mask & __GFP_WAIT) && (gfp_mask & __GFP_FS)) {
>> > >> > journal_wait_for_transaction_sync_data(journal);
>> > >> > +
>> > >> > + bh = head;
>> > >> > + do {
>> > >> > + while (atomic_read(&bh->b_count))
>> > >> > + schedule();
>> > >> > + } while ((bh = bh->b_this_page) != head);
>> > >> > ret = try_to_free_buffers(page);
>> > >> > }
>> > >>
>> > >> The loop is problematic. If the scheduler decides to keep running this
>> > >> task then we have a busy loop. If this task has realtime policy then
>> > >> it might even lock up the kernel.
>> > >>
>> > >
>> > >ocfs2 calls journal_try_to_free_buffers too; looping on b_count might
>> > >not be the best idea there either.
>> > >
>> > >This code gets called from releasepage, which is used in places other
>> > >than the O_DIRECT invalidation paths, so I'd be worried about
>> > >performance problems here.
>> > >
>> >
>> > try_to_release_page has a gfp_mask parameter, so when try_to_release_page
>> > is called from a performance-sensitive path, these flags should not be set
>> > in gfp_mask. The b_count check loop is inside the
>> > (gfp_mask & __GFP_WAIT) && (gfp_mask & __GFP_FS) check.
>>
>> Looks like try_to_free_pages will go into releasepage with wait & fs
>> both set. This kind of change would make me very nervous.
>>
>
>Hi Chris,
>
>The gfp_mask that try_to_free_pages() takes from its caller is passed
>down to try_to_release_page(). Based on the meaning of __GFP_WAIT and
>__GFP_FS, if the upper-level caller sets these two flags, I assume it
>expects a delay and is prepared to wait for the fs to finish?
>
>
>But I agree that using a loop in journal_try_to_free_buffers() to wait
>for a busy bh to drop its reference count is expensive...

I modified my patch.
I did not change the b_count check in a loop, but I introduced
set_current_state(TASK_UNINTERRUPTIBLE) to mitigate it. I think this
avoids the busy loop.
I used the same approach as do_sync_read()->wait_on_retry_sync_kiocb() and some drivers (qla2xxx).
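
For illustration only, here is a userspace analogy of sleeping until a reference count drops to zero instead of spinning on it. This is a minimal sketch and not kernel code: struct ref_counter, ref_put(), ref_put_thread() and ref_wait_zero() are hypothetical names, and a pthread condition variable stands in for the kernel's sleep/wakeup mechanism. The sketch assumes the side that drops the last reference signals the waiter, which the patch itself does not show.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a buffer's reference count. */
struct ref_counter {
	pthread_mutex_t lock;
	pthread_cond_t released;	/* signalled when count reaches zero */
	int count;
};

#define REF_COUNTER_INIT(n) \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, (n) }

/* Drop one reference and wake any waiter when the last one goes away. */
static void ref_put(struct ref_counter *rc)
{
	pthread_mutex_lock(&rc->lock);
	if (--rc->count == 0)
		pthread_cond_broadcast(&rc->released);
	pthread_mutex_unlock(&rc->lock);
}

/* Thread entry point that drops a single reference. */
static void *ref_put_thread(void *arg)
{
	ref_put(arg);
	return NULL;
}

/*
 * Sleep until the count is zero. The condition is rechecked under the
 * lock after every wakeup, so a release cannot be missed and the waiter
 * uses no CPU while blocked.
 */
static void ref_wait_zero(struct ref_counter *rc)
{
	pthread_mutex_lock(&rc->lock);
	while (rc->count != 0)
		pthread_cond_wait(&rc->released, &rc->lock);
	pthread_mutex_unlock(&rc->lock);
}
```

A caller would initialise a counter with REF_COUNTER_INIT(1), start a thread running ref_put_thread() on it, and call ref_wait_zero() to block until that thread has dropped the reference.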
Signed-off-by: Hisashi Hifumi <hifumi.hisashi@....ntt.co.jp>
diff -Nrup linux-2.6.27-rc1.org/fs/jbd/transaction.c linux-2.6.27-rc1.jbdfix/fs/jbd/transaction.c
--- linux-2.6.27-rc1.org/fs/jbd/transaction.c 2008-07-29 19:28:47.000000000 +0900
+++ linux-2.6.27-rc1.jbdfix/fs/jbd/transaction.c 2008-08-06 13:35:37.000000000 +0900
@@ -1764,6 +1764,15 @@ int journal_try_to_free_buffers(journal_
 	 */
 	if (ret == 0 && (gfp_mask & __GFP_WAIT) && (gfp_mask & __GFP_FS)) {
 		journal_wait_for_transaction_sync_data(journal);
+
+		bh = head;
+		do {
+			while (atomic_read(&bh->b_count)) {
+				set_current_state(TASK_UNINTERRUPTIBLE);
+				schedule();
+				__set_current_state(TASK_RUNNING);
+			}
+		} while ((bh = bh->b_this_page) != head);
 		ret = try_to_free_buffers(page);
 	}
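
To make the shape of the hunk above easier to follow outside the kernel tree, here is a small standalone sketch of its two ingredients: the flag gate that decides whether waiting is allowed, and the walk over the circular b_this_page ring of a page's buffers. Everything prefixed sk_ or SK_ is hypothetical. The flag values are placeholders rather than the kernel's real __GFP_* bits, and struct sk_buffer only mimics the two struct buffer_head fields used here. The sketch counts busy buffers instead of waiting on them.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag bits; the kernel's real __GFP_* values differ. */
#define SK_GFP_WAIT 0x1u
#define SK_GFP_FS   0x2u

/* Minimal stand-in for struct buffer_head: a count and a ring link. */
struct sk_buffer {
	int b_count;
	struct sk_buffer *b_this_page;	/* next buffer on the same page */
};

/* True only when the caller allows both sleeping and fs re-entry. */
static bool sk_may_wait(unsigned int gfp_mask)
{
	return (gfp_mask & SK_GFP_WAIT) && (gfp_mask & SK_GFP_FS);
}

/* Walk the ring exactly once and count buffers that are still held. */
static int sk_count_busy(struct sk_buffer *head)
{
	struct sk_buffer *bh = head;
	int busy = 0;

	do {
		if (bh->b_count > 0)
			busy++;
	} while ((bh = bh->b_this_page) != head);

	return busy;
}
```

The do/while form visits head itself before following any link, so it also works for a page with a single buffer whose b_this_page points back to itself.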