Message-ID: <CAAK6Zt2icmsBxjdqFvDXfnxZHXuKN3hDSTdDmh7Vhj1iJ_5LXQ@mail.gmail.com>
Date:	Mon, 29 Aug 2011 10:33:24 -0700
From:	Daniel Ehrenberg <dehrenberg@...gle.com>
To:	linux-kernel@...r.kernel.org
Subject: Approaches to making io_submit not block

Hi,

The Linux AIO interface (io_submit, io_getevents, etc.) is useful in
allowing multiple requests to go through the I/O stack without requiring
a userspace or even kernel thread per pending request. This is really
great for maxing out high-performance devices like SSDs. However, it
seems incomplete to me because io_submit sometimes blocks for a couple
of filesystem-related reasons. I'm wondering if this could be fixed, or
if there is an inherent need for this sort of blocking.

- Blocking due to reading metadata.
Proposed solution:
Add a per-ioctx work queue to do metadata reads. It will be triggered
from the dio code: if in async mode, then get_block will be called
with an additional flag, meaning something like O_NONBLOCK on sockets.
File systems' get_block functions can implement this flag and return
-EAGAIN if a read from the underlying device would be necessary. (If
we're worried that EAGAIN might be used for other purposes in the
future, we could make a new errno for this purpose.) From a quick
glance at the code, it looks like this would not be too difficult to
add to ext4 for extent-based files, and support in other file systems
could be added gradually. If -EAGAIN is returned, then the struct dio
will be put on the work queue together with a description of what kind
of processing it was doing. The work queue only serves the metadata
request, and the rest of the request is served on the existing path.
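
The control flow proposed above can be sketched in userspace C. This is
only a toy model under stated assumptions, not kernel code: the flag
GET_BLOCK_NOWAIT, the toy_* structures and functions, and the return
codes TOY_MAPPED and TOY_DEFERRED are all hypothetical names invented
for illustration. The real get_block callback and struct dio look
different.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: never read metadata from disk, return -EAGAIN instead. */
#define GET_BLOCK_NOWAIT 0x1
#define TOY_NBLOCKS 8

enum { TOY_MAPPED = 0, TOY_DEFERRED = 1 };

/* Toy file: a logical-to-physical block map, plus whether that map
 * (the "metadata") is already cached in memory. */
struct toy_inode {
    long phys[TOY_NBLOCKS];
    bool meta_cached;
};

/* Stand-in for the struct dio that gets parked on the work queue. */
struct toy_dio {
    struct toy_inode *inode;
    size_t lblock;
    long result;
    struct toy_dio *next;
};

/* Stand-in for the per-ioctx work queue (a simple FIFO here). */
struct toy_workqueue {
    struct toy_dio *head, *tail;
};

static int toy_get_block(struct toy_inode *inode, size_t lblock,
                         int flags, long *phys)
{
    if (lblock >= TOY_NBLOCKS)
        return -EINVAL;
    if (!inode->meta_cached) {
        if (flags & GET_BLOCK_NOWAIT)
            return -EAGAIN;        /* a device read would be needed */
        inode->meta_cached = true; /* models the blocking metadata read */
    }
    *phys = inode->phys[lblock];
    return 0;
}

/* Submission path: try without blocking, defer to the queue on -EAGAIN. */
static int toy_submit(struct toy_workqueue *wq, struct toy_dio *dio)
{
    int ret = toy_get_block(dio->inode, dio->lblock,
                            GET_BLOCK_NOWAIT, &dio->result);
    if (ret != -EAGAIN)
        return ret < 0 ? ret : TOY_MAPPED;
    dio->next = NULL;
    if (wq->tail)
        wq->tail->next = dio;
    else
        wq->head = dio;
    wq->tail = dio;
    return TOY_DEFERRED;
}

/* Worker: the only place that is allowed to block on metadata reads. */
static int toy_run_workqueue(struct toy_workqueue *wq)
{
    int served = 0;
    while (wq->head) {
        struct toy_dio *dio = wq->head;
        wq->head = dio->next;
        if (!wq->head)
            wq->tail = NULL;
        dio->next = NULL;
        int ret = toy_get_block(dio->inode, dio->lblock, 0, &dio->result);
        assert(ret != -EAGAIN); /* blocking mode must never ask for a retry */
        (void)ret;
        served++;
    }
    return served;
}
```

In this model toy_submit returns immediately in both cases: with
TOY_MAPPED when the mapping was already in memory, and with
TOY_DEFERRED when the request had to be handed to the worker.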

- Blocking for appends and writes to file holes due to the need for a
metadata write after the data write
Proposed solution:
Maintain a work queue for all appends and writes to file holes, which
executes the current code.
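
As a rough illustration of the routing decision this implies, the
following userspace C sketch classifies a write as either safe for the
existing inline path or as needing the work queue because it extends
the file or touches a hole. All names (toy_file, toy_route_write, the
TOY_PATH_* values) and the fixed 4096-byte block size are assumptions
made for the example; a real file system would consult its own extent
or block maps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BLOCK_SIZE 4096u
#define TOY_MAX_BLOCKS 16u

/* Toy file: its size in bytes and which blocks are allocated. */
struct toy_file {
    size_t size;
    bool mapped[TOY_MAX_BLOCKS]; /* false means the block is a hole */
};

enum toy_path {
    TOY_PATH_INLINE,    /* overwrite of allocated blocks: no metadata write */
    TOY_PATH_WORKQUEUE, /* append or hole fill: metadata write follows data */
    TOY_PATH_INVALID    /* empty or out-of-range request */
};

/* Decide where a write of len bytes at byte offset off should go. */
static enum toy_path toy_route_write(const struct toy_file *f,
                                     size_t off, size_t len)
{
    const size_t capacity = (size_t)TOY_MAX_BLOCKS * TOY_BLOCK_SIZE;

    assert(f != NULL);
    /* Reject empty writes and anything past the toy file's capacity;
     * the comparison is written to avoid overflow in off + len. */
    if (len == 0 || off >= capacity || len > capacity - off)
        return TOY_PATH_INVALID;

    /* A write that ends past the current size is an append. */
    if (off + len > f->size)
        return TOY_PATH_WORKQUEUE;

    /* Otherwise look for holes among the blocks the write touches. */
    size_t first = off / TOY_BLOCK_SIZE;
    size_t last = (off + len - 1) / TOY_BLOCK_SIZE;
    for (size_t b = first; b <= last; b++) {
        if (!f->mapped[b])
            return TOY_PATH_WORKQUEUE;
    }
    return TOY_PATH_INLINE;
}
```

Only requests routed to TOY_PATH_WORKQUEUE would pay the cost of the
deferred path; plain overwrites of allocated blocks stay on the
existing submission path.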

Has anything like this been discussed or implemented? What I'm talking
about isn't optimal in terms of parallelism; it just matches the
parallelism of the current approach (with the minor caveat that
multiple threads on the same core calling io_submit on the same ioctx
don't get to run their metadata/append I/O requests concurrently), but
allows the io_submit system call to return to userspace much faster.
I've read about other proposals for general asynchronous syscalls, but
this would be lighter weight in not requiring a kernel task per I/O
request.

Thanks,
Dan
