lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 16 Feb 2007 02:32:11 +0300
From:	"Ananiev, Leonid I" <leonid.i.ananiev@...el.com>
To:	"Zach Brown" <zach.brown@...cle.com>
Cc:	"Ken Chen" <kenchen@...gle.com>, <suparna@...ibm.com>,
	"Andrew Morton" <akpm@...ux-foundation.org>,
	<linux-kernel@...r.kernel.org>, "linux-aio" <linux-aio@...ck.org>,
	"Chris Mason" <chris.mason@...cle.com>
Subject: RE: [PATCH] aio: fix kernel bug when page is temporally busy

>> If EIOCBRETRY then generic_file_aio_write() will be recalled for the
>> same iocb.
> Only if kick_iocb() is called.  It won't be called if i_i_p2_r() was  
> the only thing to return -EIOCBRETRY.
It is not need to call kick_iocb()
for generic_file_aio_write() calling.
It is recalled without any wakeup waiting:
        for (;;) {
                ret = filp->f_op->aio_write(&kiocb, &iov, 1,
kiocb.ki_pos);
                if (ret != -EIOCBRETRY)
                        break;
                wait_on_retry_sync_kiocb(&kiocb);
        }
Note: wait_on_retry_sync_kiocb() does not wait.
That is for dio. For aio iocb generic_file_aio_write() call
is required from ki_run_list while next io_submit or
read_events() is called.
So when an IO hang may happen?

>> It overwrites -EIOCBQUEUED
Do you mean that there is one more kernel bug which
overwrites -EIOCBQUEUED by any errno or number of bytes and this
new value is returned to caller as an IO result
while IO is not finished yet.

The proposed patch does not crate this bug if any.
It actually fixes a kernel panic bag when iocb.users count becomes
incorrect. The bag " Kernel BUG at fs/aio.c:509" is there because
aio_run_iocb() have not a chance to differ real EIO and
EIO which is actually means EAGAYN or EIOCBRETRY.
I'me sure the patch changes source code in correct direction:
to differentiate that two kinds of EIOs.

> Have you read the giant comment over the definition of struct kiocb  
> in include/linux/aio.h?
I have read. But compiler has not: it did not create an object code for
* If ki_retry returns -EIOCBRETRY ...

>>> This can lead to reference count confusion.
>> But just reference count confusion was deleted by patch. Isn't it?
>Sorry, I don't understand what you're trying to ask here.
One of reference count iocb.users confusion is deleted by the patch.
I'm not sure that there is other bag.

At least I have not see IO hang while testing.
It is interesting that I've not seen any EIOCBQUEUED returned
to aio_run_iocb() during 5 hours aiostress running.
Does it mean that EIOCBQUEUED is always reset and never returned?

Leonid

-----Original Message-----
From: Zach Brown [mailto:zach.brown@...cle.com] 
Sent: Thursday, February 15, 2007 10:23 PM
To: Ananiev, Leonid I
Cc: Ken Chen; suparna@...ibm.com; Andrew Morton;
linux-kernel@...r.kernel.org; linux-aio; Chris Mason
Subject: Re: [PATCH] aio: fix kernel bug when page is temporally busy


On Feb 15, 2007, at 11:11 AM, Ananiev, Leonid I wrote:

>> It returns -EIOCBRETRY without guaranteeing that kick_iocb() will be
>> called.  This can lead to operations hanging
>
> If EIOCBRETRY then generic_file_aio_write() will be recalled for the
> same iocb.

Only if kick_iocb() is called.  It won't be called if i_i_p2_r() was  
the only thing to return -EIOCBRETRY.

>
>> It overwrites -EIOCBQUEUED, leading to an aio_complete() while a
>> retry is happening.
>
> EIOCBQUEUED or EIOCBRETRY does not lead to aio_complete() call:

Not by fs/aio.c, but *by the place that originated -EIOCBQUEUED*.   
Later.  After IO has completed.  see fs/direct-io.c:dio_bio_end_aio().

This is what -EIOCBQUEUED means!  It's a promise to call aio_complete 
() in the future.

Have you read the giant comment over the definition of struct kiocb  
in include/linux/aio.h?

>> This can lead to reference count confusion.
> But just reference count confusion was deleted by patch. Isn't it?

Sorry, I don't understand what you're trying to ask here.

- z
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ