Message-Id: <4FF990C3-F998-4003-83D4-91CAE76FCDBE@oracle.com>
Date: Wed, 21 Feb 2007 10:24:56 -0800
From: Zach Brown <zach.brown@...cle.com>
To: Ken Chen <kenchen@...gle.com>
Cc: "Ananiev, Leonid I" <leonid.i.ananiev@...el.com>,
Chris Mason <chris.mason@...cle.com>,
linux-aio <linux-aio@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Benjamin LaHaise <bcrl@...ck.org>,
Suparna bhattacharya <suparna@...ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Badari Pulavarty <pbadari@...ibm.com>
Subject: Re: [PATCH 2/2] aio: propogate post-EIOCBQUEUED errors to completion event
On Feb 21, 2007, at 12:35 AM, Ken Chen wrote:
> On 2/20/07, Ananiev, Leonid <leonid.i.ananiev@...el.com> wrote:
>> 1) mem=1G in kernel boot param if you have more
>> 2) unmount; mk2fs; mount
>> 3) dd if=/dev/zero of=<test_file> bs=1M count=1200
>> 4) aiostress -s 1200m -O -o 2 -i 1 -r 16k <test_file>
>> 5) if i++<50 goto 2).
>
> Would you please instrument the call chain of
> invalidate_complete_page2() and tell us exactly where it returns zero
> value in your failure case?
>
>   invalidate_complete_page2
>     try_to_release_page
>       ext3_releasepage
>         journal_try_to_free_buffers
>           ???
For what it's worth, Badari has explained this race in the past in a
credible way. I'll take the liberty of pasting a mail from him:
"
kjournald submitted buffers for IO and is waiting for them to finish.
Note that it holds a reference on the buffer.
journal_commit_transaction()
	...
	submitted buffers for IO
	/* Waiting for IO to complete */
	while (commit_transaction->t_locked_list) {
		...
		get_bh(bh);
		if (buffer_locked(bh)) {
			spin_unlock(&journal->j_list_lock);
			wait_on_buffer(bh);	<<<<<<
			spin_lock(&journal->j_list_lock);
		}
		..
		put_bh(bh);
	}
Now the DIO process comes along to free the jh through
journal_try_to_free_buffers(), but drop_buffers() fails since
kjournald still holds a reference on the buffer.
invalidate_inode_pages2_range()
	..
	ext3_releasepage()
		journal_try_to_free_buffers()
			journal_put_journal_head()
			__journal_try_to_free_buffer()
				<--- freed jh
			try_to_free_buffers()
				drop_buffers()
					if (buffer_busy(bh))
						goto failed;
					<<--- returns EIO due to b_count
"
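To make the interleaving concrete, here is a small user-space sketch of the reference-count check that trips up the DIO path. It is only a toy model, not kernel code: the `toy_` names and the `struct toy_bh` layout are made up for illustration, and the busy test is a simplified assumption of what buffer_busy() looks at (a held reference, or a dirty or locked buffer).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct buffer_head: only what the race needs. */
struct toy_bh {
	int  b_count;	/* references held, as taken by get_bh()/put_bh() */
	bool dirty;
	bool locked;	/* IO in flight */
};

static void toy_get_bh(struct toy_bh *bh) { bh->b_count++; }
static void toy_put_bh(struct toy_bh *bh) { bh->b_count--; }

/* Assumed simplification of buffer_busy(): any ref, or dirty, or locked. */
static bool toy_buffer_busy(const struct toy_bh *bh)
{
	return bh->b_count > 0 || bh->dirty || bh->locked;
}

/* Returns 1 if the buffer could be dropped, 0 if it is still busy. */
static int toy_drop_buffers(const struct toy_bh *bh)
{
	if (toy_buffer_busy(bh))
		return 0;	/* the "goto failed" case in the trace above */
	return 1;
}

/*
 * Replays the interleaving sequentially and returns what the DIO path
 * sees while the kjournald side still holds its reference.
 */
static int toy_race_demo(void)
{
	struct toy_bh bh = { 0, false, false };
	int during, after;

	toy_get_bh(&bh);		/* kjournald: get_bh() before waiting */
	during = toy_drop_buffers(&bh);	/* DIO: invalidate tries to drop */
	toy_put_bh(&bh);		/* kjournald: put_bh() after the wait */
	after = toy_drop_buffers(&bh);	/* a later attempt would succeed */

	assert(after == 1);
	return during;
}
```

In this model toy_race_demo() returns 0: the IO has already completed and the buffer is clean, yet the drop fails purely because of the transient reference, which is the window Badari describes.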
I don't mean to say that we shouldn't get traces to confirm the
theory, just sharing. And now we can point to this in the archives
next time :).
- z