lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <B41635854730A14CA71C92B36EC22AAC8723A6@mssmsx411>
Date:	Wed, 21 Feb 2007 22:15:00 +0300
From:	"Ananiev, Leonid I" <leonid.i.ananiev@...el.com>
To:	"Zach Brown" <zach.brown@...cle.com>,
	"Ken Chen" <kenchen@...gle.com>
Cc:	"Chris Mason" <chris.mason@...cle.com>,
	"linux-aio" <linux-aio@...ck.org>,
	"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
	"Benjamin LaHaise" <bcrl@...ck.org>,
	"Suparna bhattacharya" <suparna@...ibm.com>,
	"Andrew Morton" <akpm@...ux-foundation.org>,
	"Badari Pulavarty" <pbadari@...ibm.com>
Subject: RE: [PATCH 2/2] aio: propogate post-EIOCBQUEUED errors to completion event

> kjournald submited buffers for IO and waiting for them to finish.

It is means that the patch incorrectly moves internal kernel
synchronization
problem into user space as EIO instead of fixing a root cause or perform
iterative
synchronization.
After patching users will be surprised a lot of EIO
(remember 47% EIO in aio-stress runs after patching) will buy new
disk...
This patch is harmful.

Leonid
>-----Original Message-----
>From: Zach Brown [mailto:zach.brown@...cle.com]
>Sent: Wednesday, February 21, 2007 9:25 PM
>To: Ken Chen
>Cc: Ananiev, Leonid I; Chris Mason; linux-aio; Linux Kernel Mailing
List; Benjamin LaHaise; Suparna
>bhattacharya; Andrew Morton; Badari Pulavarty
>Subject: Re: [PATCH 2/2] aio: propogate post-EIOCBQUEUED errors to
completion event
>
>
>On Feb 21, 2007, at 12:35 AM, Ken Chen wrote:
>
>> On 2/20/07, Ananiev, Leonid <leonid.i.ananiev@...el.com> wrote:
>>> 1) mem=1G in kernel boot param if you have more
>>> 2) unmount; mk2fs; mount
>>> 3) dd if=/dev/zero of=<test_file> bs=1M count=1200
>>> 4) aiostress -s 1200m -O -o 2 -i 1 -r 16k <test_file>
>>> 5) if i++<50 goto 2).
>>
>> Would you please instrument the call chain of
>> invalidate_complete_page2() and tell us exactly where it returns zero
>> value in your failure case?
>>
>>   invalidate_complete_page2
>>      try_to_release_page
>>         ext3_releasepage
>>            journal_try_to_free_buffers
>>               ???
>
>For what it's worth, Badari has explained this race in the past in a
>credible way.  I'll take the liberty of pasting a mail from him:
>
>"
>kjournald submited buffers for IO and waiting for them to finish.
>Note that it has a ref. against the buffer.
>
>journal_commit_transaction()
>         ...
>         submited buffers for IO
>         /* Waiting for IO to complete */
>         while (commit_transaction->t_locked_list) {
>                 ...
>                 get_bh(bh);
>                 if (buffer_locked(bh)) {
>                         spin_unlock(&journal->j_list_lock);
>                         wait_on_buffer(bh);  <<<<<<
>                         spin_lock(&journal->j_list_lock);
>                 }
>
>                 ..
>                 put_bh(bh);
>         }
>
>Now, DIO process comes to frees the jh through
>journal_try_to_free_buffers()
>but fails to drop_buffers() since kjournald() has a reference against
>it.
>invalidate_inode_pages2_range()
>         ..
>         ext3_releasepage()
>                 journal_try_to_free_buffers()
>                         journal_put_journal_head()
>                                 __journal_try_to_free_buffer()
>                                         <--- freed jh
>
>                         try_to_free_buffers()
>                                 drop_buffers()
>                                         if (buffer_busy(bh))
>                                                 goto failed;
>                                           <<--- returns EIO due to
>b_count
>
>"
>
>I don't mean to say that we shouldn't get traces to confirm the
>theory, just sharing.  And now we can point to this in the archives
>next time :).
>
>- z
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ