Message-ID: <20131108175608.GK6710@lenny.home.zabbo.net>
Date:	Fri, 8 Nov 2013 09:56:08 -0800
From:	Zach Brown <zab@...hat.com>
To:	Kent Overstreet <kmo@...erainc.com>
Cc:	Christoph Hellwig <hch@...radead.org>,
	Dave Kleikamp <dave.kleikamp@...cle.com>,
	Stephen Rothwell <sfr@...b.auug.org.au>,
	Jens Axboe <axboe@...nel.dk>, linux-next@...r.kernel.org,
	linux-kernel@...r.kernel.org, Olof Johansson <olof@...om.net>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: linux-next: manual merge of the block tree with the  tree

> > > That make sense? I can show you more concretely what I'm working on if
> > > you want. Or if I'm full of crap and this is useless for what you guys
> > > want I'm sure you'll let me know :)
> > 
> > It sounds interesting, but also a little confusing at this point, at
> > least from the non-block side of view.
> 
> Zach, you want to chime in? He was involved in the discussion yesterday,
> he might be able to explain this stuff better than I.

I can try.  I may not do the *best* job because I've been on the
periphery of most of this since I left the proof of concept back at
Oracle :).

The first part is passing in pages instead of mapped addresses.  That's
where the iov_iter argument came from.  A ham-fisted proof of concept to
try to abstract iterating over any old type of memory.  But it's not
*really* abstract because dio magically knows (look for gross
iov_iter_has_iovec() callers) whether the memory is in iovecs or bio
pages when it's verifying alignment, pinning or not, etc.  In the end
it's little more than syntactic sugar to try and pretend that two
interfaces are one.
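
As a rough userspace illustration of that leak (not the actual kernel
code), the sketch below models an iterator as a tagged union.  Every
sketch_* name is hypothetical, and the rule that only user iovec memory
needs pinning is an assumption made for the example.  The point is that
the caller still has to ask which kind of memory it was handed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two kinds of memory an iterator can hide. */
enum sketch_iter_kind { SKETCH_ITER_IOVEC, SKETCH_ITER_BIO_PAGES };

struct sketch_iter {
    enum sketch_iter_kind kind;
    size_t count;                 /* bytes remaining */
    union {
        const void *user_addr;    /* SKETCH_ITER_IOVEC: mapped user address */
        void *const *pages;       /* SKETCH_ITER_BIO_PAGES: array of pages */
    } u;
};

/* Plays the role of an iov_iter_has_iovec()-style type query. */
static int sketch_iter_has_iovec(const struct sketch_iter *it)
{
    assert(it != NULL);
    return it->kind == SKETCH_ITER_IOVEC;
}

/*
 * The "abstract" consumer still branches on the concrete type.
 * Assumption for this sketch: user memory must be pinned, while pages
 * that arrived in a bio are already held by whoever built the bio.
 */
static int sketch_needs_pinning(const struct sketch_iter *it)
{
    return sketch_iter_has_iovec(it);
}
```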

For expedience, this iov_iter approach used the loop's bio to store the
pages in the iov_iter rather than translating the bio's pages to a page
array in the iov_iter.

So the first part of what I think Kent is picturing is to take that to
its logical conclusion and have the caller describe the io memory and
offset with a bio instead of explicit address and offset arguments.
This way dio can do nice bio management operations to kick off its
device bios rather than having to clumsily build them from either
incoming pages or mapped user addresses that are hidden in iov_iter.

I'm imagining cutting the current dio up into two phases.  One that
pins user pages and puts them in bios and one that maps those file bios
to device bios and submits them.  Then the fop method becomes the second
phase so that loop can call it with its file bios.  Call it
->submit_file_bio() instead of ->do_direct_IO(), maybe?
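
A minimal userspace sketch of that two-phase split follows.  It is not
kernel code: struct sketch_bio is a hypothetical stand-in for a bio, the
4096-byte page size is assumed, and the linear file-to-device mapping
(a single extent starting at extent_start) is invented for illustration.
Phase one cuts a user buffer into page-bounded segments; phase two takes
the resulting file bio and produces a device bio, which is the part a
caller such as loop could invoke directly with its own file bios.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_MAX_VECS  16u

/* Hypothetical segment: a "page" plus offset and length within it. */
struct sketch_vec {
    const unsigned char *page;
    size_t offset;
    size_t len;
};

/* Hypothetical stand-in for a bio; not the kernel's struct bio. */
struct sketch_bio {
    uint64_t pos;       /* file offset (file bio) or device offset (device bio) */
    size_t nr_vecs;
    struct sketch_vec vecs[SKETCH_MAX_VECS];
};

/* Phase 1: describe a user buffer as page-bounded segments in a file bio. */
static int sketch_pin_user_buffer(struct sketch_bio *file_bio, const void *buf,
                                  size_t len, uint64_t file_pos)
{
    uintptr_t addr = (uintptr_t)buf;

    assert(file_bio != NULL);
    file_bio->pos = file_pos;
    file_bio->nr_vecs = 0;

    while (len > 0) {
        size_t off = (size_t)(addr % SKETCH_PAGE_SIZE);
        size_t chunk = SKETCH_PAGE_SIZE - off;
        struct sketch_vec *v;

        if (chunk > len)
            chunk = len;
        if (file_bio->nr_vecs == SKETCH_MAX_VECS)
            return -1;  /* too many segments for this sketch */

        v = &file_bio->vecs[file_bio->nr_vecs++];
        v->page = (const unsigned char *)(addr - off);
        v->offset = off;
        v->len = chunk;

        addr += chunk;
        len -= chunk;
    }
    return 0;
}

/* Phase 2: map a file bio to a device bio (assumed single linear extent). */
static void sketch_submit_file_bio(const struct sketch_bio *file_bio,
                                   uint64_t extent_start,
                                   struct sketch_bio *dev_bio)
{
    *dev_bio = *file_bio;                        /* same memory segments */
    dev_bio->pos = extent_start + file_bio->pos; /* translated position */
}

/* Total number of bytes described by a bio's segments. */
static size_t sketch_bio_bytes(const struct sketch_bio *bio)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < bio->nr_vecs; i++)
        total += bio->vecs[i].len;
    return total;
}
```

In this shape a plain read or write would run both phases, while a
stacking caller that already holds file bios would skip straight to
sketch_submit_file_bio().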

The other part of this series that isn't getting as much attention,
though, is async submission and completion.  This patch introduces a
weird in-kernel aio submission interface that adds special cases to aio.
In this new bio world order we could get rid of that complication by
relying on the bio's ->bi_end_io() for completion.
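
To make the completion idea concrete, here is a small userspace sketch,
assuming a callback stored in the request itself in the spirit of
->bi_end_io().  The sketch_* types and functions are hypothetical and
only mimic the pattern: the submitter never needs to know who is
waiting, it just calls whatever completion hook the caller installed.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_async_bio;
typedef void (*sketch_end_io_t)(struct sketch_async_bio *bio, int error);

/* Hypothetical stand-in for a bio carrying its own completion callback. */
struct sketch_async_bio {
    size_t size;               /* bytes covered by this request */
    sketch_end_io_t end_io;    /* completion hook chosen by the caller */
    void *private_data;        /* caller context handed back on completion */
};

/* Example caller state, e.g. what a loop-like driver might track. */
struct sketch_loop_request {
    int done;
    int error;
    size_t bytes;
};

/* The caller's completion handler: records the result in its own request. */
static void sketch_loop_end_io(struct sketch_async_bio *bio, int error)
{
    struct sketch_loop_request *rq = bio->private_data;

    rq->done = 1;
    rq->error = error;
    rq->bytes = error ? 0 : bio->size;
}

/* The lower layer finishes the I/O and reports it through the hook only. */
static void sketch_complete_bio(struct sketch_async_bio *bio, int error)
{
    assert(bio->end_io != NULL);
    bio->end_io(bio, error);
}
```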

I suppose a high level view of this strategy is to move more towards a
stack where layers have matching inputs and outputs.  If both dio and
loop take bios as input and translate them into submitted output bios
then the stacking becomes more natural.

That's the blue sky fantasy anyway.  There's a lot of detail being
glossed over.  I want to see what the patches look like.

- z