Date:	Wed, 18 Mar 2009 12:13:01 -0400
From:	Jeff Moyer <jmoyer@...hat.com>
To:	Eric Dumazet <dada1@...mosbay.com>
Cc:	Davide Libenzi <davidel@...ilserver.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Benjamin LaHaise <bcrl@...ck.org>,
	Trond Myklebust <trond.myklebust@....uio.no>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-aio <linux-aio@...ck.org>, zach.brown@...cle.com
Subject: Re: [patch] eventfd - remove fput() call from possible IRQ context (2nd rev)

Eric Dumazet <dada1@...mosbay.com> writes:

> Jeff Moyer wrote:
>> Eric Dumazet <dada1@...mosbay.com> writes:
>> 
>> 
>>>> 	rwfd = open("rwfile", O_RDWR|O_DIRECT);		assert(rwfd != -1);
>>>> 	if (posix_memalign((void **)&buf, getpagesize(), SIZE) != 0) {
>>>> 		perror("posix_memalign");
>>>> 		exit(1);
>>>> 	}
>>>> 	memset(buf, 0x42, SIZE);
>>>>
>>>> 	/* Write test. */
>>>> 	res = io_queue_init(1024, &io_ctx);		assert(res == 0);
>>>> 	io_prep_pwrite(&iocb, rwfd, buf, SIZE, 0);
>>>> 	io_set_eventfd(&iocb, efd);
>>>> 	res = io_submit(io_ctx, 1, iocbs);		assert(res == 1);
>>> Yes, but io_submit() is blocking, so your close(efd) will come after the release in fs/aio.c.
>> 
>> I'm not sure why you think io_submit is blocking.  In my setup, I
>> preallocated the file, and the test code opens it with O_DIRECT.  So,
>> io_submit should return after the dio is issued, and the I/O size is
>> large enough that it should still be outstanding when io_submit returns.
>
> Hmm... io_submit() is a blocking syscall; this is how I understood fs/aio.c.

Hi, Eric,

The whole point of io_submit is to allow you to submit I/O without
waiting for it.  There are known cases where io_submit will block, of
course, such as when we run out of request descriptors.  See the
io_submit.stp script for some examples.[1]
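
One way to check how long io_submit actually takes on a given setup is to time the submission and the wait for completion separately. The following is only a minimal sketch, not the reproducer from this thread: time_aio_write() and now_ms() are hypothetical names, the raw AIO syscalls are used instead of libaio to avoid the extra dependency, and the fallback to buffered I/O when O_DIRECT is refused is an assumption made so the code runs on any filesystem.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock in milliseconds. */
static double now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1e3 + ts.tv_nsec / 1e6;
}

/*
 * time_aio_write() is a hypothetical helper, not part of the reproducer.
 * It writes `size` bytes to `path` with one AIO request that signals an
 * eventfd, and reports the time spent inside io_submit() and the time
 * spent waiting for the completion. `size` should be a multiple of 4096
 * so that O_DIRECT accepts it. Returns 0 on success, -1 on any failure.
 */
int time_aio_write(const char *path, size_t size,
		   double *submit_ms, double *wait_ms)
{
	aio_context_t ctx = 0;
	struct iocb cb, *cbs[1] = { &cb };
	struct io_event ev;
	uint64_t count = 0;
	void *buf = NULL;
	double t0, t1, t2;
	int fd, efd, rc = -1;

	assert(submit_ms && wait_ms);

	/* Assumption: fall back to buffered I/O where O_DIRECT is refused. */
	fd = open(path, O_RDWR | O_CREAT | O_DIRECT, 0600);
	if (fd < 0)
		fd = open(path, O_RDWR | O_CREAT, 0600);
	efd = eventfd(0, 0);
	if (fd < 0 || efd < 0 || posix_memalign(&buf, 4096, size) != 0)
		goto out;
	memset(buf, 0x42, size);
	if (syscall(SYS_io_setup, 1, &ctx) < 0)
		goto out;

	memset(&cb, 0, sizeof(cb));
	cb.aio_lio_opcode = IOCB_CMD_PWRITE;
	cb.aio_fildes = fd;
	cb.aio_buf = (uint64_t)(uintptr_t)buf;
	cb.aio_nbytes = size;
	cb.aio_offset = 0;
	cb.aio_flags = IOCB_FLAG_RESFD;	/* signal efd on completion */
	cb.aio_resfd = efd;

	t0 = now_ms();
	if (syscall(SYS_io_submit, ctx, 1L, cbs) != 1)
		goto out_ctx;
	t1 = now_ms();
	/* Blocks until the kernel has added 1 to the eventfd counter. */
	if (read(efd, &count, sizeof(count)) != sizeof(count))
		goto out_ctx;
	t2 = now_ms();

	/* Reap the completion event and check that the full write succeeded. */
	if (syscall(SYS_io_getevents, ctx, 1L, 1L, &ev, NULL) == 1 &&
	    ev.res == (long long)size) {
		*submit_ms = t1 - t0;
		*wait_ms = t2 - t1;
		rc = 0;
	}
out_ctx:
	syscall(SYS_io_destroy, ctx);
out:
	free(buf);
	if (efd >= 0)
		close(efd);
	if (fd >= 0)
		close(fd);
	return rc;
}
```

On a preallocated file opened with O_DIRECT, I would expect most of the time to show up in the wait, not in the submission. If the buffered fallback is taken, the write is typically done inside io_submit itself, so the numbers will look the other way around.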

Now, I admit I was testing using an SSD, so I didn't actually notice the
time it took for the 256MB write (!!!).  I tried the reproducer I posted
on my F9 box, and here is the output I get:

BUG: sleeping function called from invalid context at fs/file_table.c:262
in_atomic():1, irqs_disabled():1
Pid: 0, comm: swapper Not tainted 2.6.27.15-78.2.23.fc9.x86_64 #1

Call Trace:
 <IRQ>  [<ffffffff8103892e>] __might_sleep+0xe7/0xec
 [<ffffffff810bfa86>] __fput+0x35/0x16d
 [<ffffffff810bfbd3>] fput+0x15/0x17
 [<ffffffff810d71bb>] really_put_req+0x34/0x9c
 [<ffffffff810d72f0>] __aio_put_req+0xcd/0xda
 [<ffffffff810d7f77>] aio_complete+0x15d/0x19f
 [<ffffffff810e7016>] dio_bio_end_aio+0x8e/0xa0
 [<ffffffff810e32ab>] bio_endio+0x2a/0x2c
 [<ffffffff8113beae>] req_bio_endio+0x9d/0xba
 [<ffffffff8113c073>] __end_that_request_first+0x1a8/0x2b5
 [<ffffffff8113cb89>] blk_end_io+0x2f/0xa9
 [<ffffffff8113cc2f>] blk_end_request+0xe/0x10
 [<ffffffffa005d30b>] scsi_end_request+0x30/0x90 [scsi_mod]
 [<ffffffffa005d9e9>] scsi_io_completion+0x1aa/0x3b3 [scsi_mod]
 [<ffffffffa0057658>] scsi_finish_command+0xde/0xe7 [scsi_mod]
 [<ffffffffa005de68>] scsi_softirq_done+0xe4/0xed [scsi_mod]
 [<ffffffff8113baa8>] blk_done_softirq+0x7e/0x8e
 [<ffffffff81045146>] __do_softirq+0x7e/0x10c
 [<ffffffff81011bfc>] call_softirq+0x1c/0x28
 [<ffffffff81012e06>] do_softirq+0x4d/0xb0
 [<ffffffff81044d1b>] irq_exit+0x4e/0x9d
 [<ffffffff81013122>] do_IRQ+0x147/0x169
 [<ffffffff81010963>] ret_from_intr+0x0/0x2e
 <EOI>  [<ffffffff810173a9>] ? mwait_idle+0x3e/0x4f
 [<ffffffff810173a0>] ? mwait_idle+0x35/0x4f
 [<ffffffff8100f2a7>] ? cpu_idle+0xb2/0x10b
 [<ffffffff812af35d>] ? rest_init+0x61/0x63

So, I think it is a valid reproducer as it stands.

> Then, using strace -tt -T on your program, I can confirm it is quite a long syscall (3.5 seconds,
> about the time needed to write a 256 MB file on my disk ;) )

Did you preallocate the file?
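
Preallocation matters here because a direct I/O write into blocks that are not yet allocated can force the filesystem to do the allocation during submission, which may make io_submit block. A minimal sketch of one way to preallocate, assuming posix_fallocate is acceptable for the test (preallocate_file() is a hypothetical helper, and this is not necessarily how the file in my setup was created):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * preallocate_file() is a hypothetical helper: create `path` if needed
 * and reserve `size` bytes of storage for it before any AIO is issued.
 * Returns 0 on success, -1 on failure.
 */
int preallocate_file(const char *path, off_t size)
{
	struct stat st;
	int fd, err, ok;

	assert(size > 0);

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;

	/* posix_fallocate() returns an error number, not -1 with errno set. */
	err = posix_fallocate(fd, 0, size);

	/* Confirm that the file now covers the requested range. */
	ok = (err == 0 && fstat(fd, &st) == 0 && st.st_size >= size);

	if (close(fd) < 0)
		return -1;
	return ok ? 0 : -1;
}
```

Whether this alone is enough to keep io_submit from blocking depends on the filesystem and kernel version. On some of them, writing real data over the whole file once beforehand may be the safer way to prepare it.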

Cheers,
Jeff

[1] http://sourceware.org/systemtap/wiki/ScriptsTools?action=AttachFile&do=view&target=io_submit.stp