Message-ID: <x49zk44ojpe.fsf@segfault.boston.devel.redhat.com>
Date: Tue, 02 Oct 2012 13:41:17 -0400
From: Jeff Moyer <jmoyer@...hat.com>
To: Kent Overstreet <koverstreet@...gle.com>
Cc: linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
tytso@...gle.com, tj@...nel.org,
Dave Kleikamp <dave.kleikamp@...cle.com>,
Zach Brown <zab@...bo.net>,
Dmitry Monakhov <dmonakhov@...nvz.org>,
"Maxim V. Patlasov" <mpatlasov@...allels.com>,
michael.mesnier@...el.com, jeffrey.d.skirvin@...el.com
Subject: Re: [RFC, PATCH] Extensible AIO interface
Kent Overstreet <koverstreet@...gle.com> writes:
> So, I and other people keep running into things where we really need to
> add an interface to pass some auxiliary... stuff along with a pread() or
> pwrite().
>
> A few examples:
>
> * IO scheduler hints. Some userspace program wants to, per IO, specify
> either priorities or a cgroup - by specifying a cgroup you can have a
> fileserver in userspace that makes use of cfq's per cgroup bandwidth
> quotas.
You can do this today by splitting I/O between processes and placing
those processes in different cgroups. For I/O priority, there is
ioprio_set(), which costs an extra system call per change of priority
but works. Not elegant, but possible.
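A minimal sketch of the ioprio_set() approach, assuming Linux and a libc without a wrapper for the call (glibc has none, so it goes through syscall()). The EX_ constants mirror the kernel's ioprio ABI values, and the ex_ helper names are hypothetical, not taken from any patch in this thread:

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Values from the kernel's ioprio ABI, redefined here under
 * hypothetical EX_ names because glibc does not export them. */
#define EX_IOPRIO_CLASS_SHIFT 13
#define EX_IOPRIO_WHO_PROCESS 1
#define EX_IOPRIO_CLASS_BE    2   /* best-effort class, levels 0..7 */

/* Pack a scheduling class and a level into a single ioprio value. */
int ex_ioprio_value(int ioclass, int level)
{
    return (ioclass << EX_IOPRIO_CLASS_SHIFT) | level;
}

/* Set the I/O priority of the calling thread (who == 0). */
int ex_set_ioprio(int ioclass, int level)
{
    return (int)syscall(SYS_ioprio_set, EX_IOPRIO_WHO_PROCESS, 0,
                        ex_ioprio_value(ioclass, level));
}

/* The extra system call: set the priority, then issue the read.
 * The priority stays in effect for later I/O until it is changed. */
ssize_t ex_pread_with_prio(int fd, void *buf, size_t len, off_t off,
                           int level)
{
    if (ex_set_ioprio(EX_IOPRIO_CLASS_BE, level) < 0)
        return -1;
    return pread(fd, buf, len, off);
}
```

The priority is a property of the thread, not of the individual request, so a caller that mixes priorities pays one ioprio_set() for every change.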
> * Cache hints. For bcache and other things, userspace may want to specify
> "this data should be cached", "this data should bypass the cache", etc.
Please explain how you will differentiate this from posix_fadvise.
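For comparison, a rough sketch of what posix_fadvise() already offers for a byte range of an open file. The wrapper name ex_cache_hint is hypothetical, and mapping "cache this" and "bypass the cache" onto POSIX_FADV_WILLNEED and POSIX_FADV_DONTNEED is an assumption made for illustration; the advice applies to the page cache and is only a hint:

```c
#define _POSIX_C_SOURCE 200112L
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical helper: advise the kernel about a byte range of fd.
 * want_cached != 0 -> POSIX_FADV_WILLNEED (expect to access it soon)
 * want_cached == 0 -> POSIX_FADV_DONTNEED (no access expected soon)
 * Returns 0 on success or an error number, as posix_fadvise() does;
 * errno is not set. */
int ex_cache_hint(int fd, off_t off, off_t len, int want_cached)
{
    int advice = want_cached ? POSIX_FADV_WILLNEED : POSIX_FADV_DONTNEED;
    return posix_fadvise(fd, off, len, advice);
}
```

Unlike a per-I/O attribute, the advice is attached to a file range rather than to one particular pread() or pwrite().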
> * Passing checksums out to userspace. We've got bio integrity, which is
> a (somewhat) generic interface for passing data checksums between the
> filesystem and the hardware. There are various circumstances under which
> you may want to pass these checksums out to userspace, and if so we
> ought to have a generic way of doing it.
Yes, that needs a new interface.
> Hence, AIO attributes.
*No.* Start with the non-AIO case first.
> * FUTURE STUFF:
>
> Return values:
>
> Some attributes are probably going to want to return something to
> userspace.
>
> If nothing else, we want this so that userspace can tell if anything
> handled the attributes it specified - as dynamic as the io stack can be,
> with something extensible like this there really isn't any generic way
> of knowing ahead of time if something is going to interpret any
> attribute - we want to return at least an error code.
Seems odd to me. Why not expose the supported attributes via some other
call, such as fcntl()?
> One could imagine sticking the return in the attribute itself, but I
> don't want to do this. For some things (checksums), the attribute will
> contain a pointer to a buffer - that's fine. But I don't want the
> attributes themselves to be writeable.
One could imagine that attributes don't return anything, because, well,
they're properties of something else, and properties don't return
anything.
Cheers,
Jeff