Message-ID: <20121001231222.GB14533@lenny.home.zabbo.net>
Date: Mon, 1 Oct 2012 16:12:22 -0700
From: Zach Brown <zab@...bo.net>
To: Kent Overstreet <koverstreet@...gle.com>
Cc: linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
tytso@...gle.com, tj@...nel.org,
Dave Kleikamp <dave.kleikamp@...cle.com>,
Dmitry Monakhov <dmonakhov@...nvz.org>,
"Maxim V. Patlasov" <mpatlasov@...allels.com>,
michael.mesnier@...el.com, jeffrey.d.skirvin@...el.com,
Martin Petersen <martin.petersen@...cle.com>
Subject: Re: [RFC, PATCH] Extensible AIO interface
On Mon, Oct 01, 2012 at 03:23:41PM -0700, Kent Overstreet wrote:
> So, I and other people keep running into things where we really need to
> add an interface to pass some auxiliary... stuff along with a pread() or
> pwrite().
Sure. Martin (cc:ed) will sympathize.
> A few examples:
>
> * IO scheduler hints...
> * Cache hints...
>
> * Passing checksums out to userspace. We've got bio integrity, which is
> a (somewhat) generic interface for passing data checksums between the
> filesystem and the hardware.
Hmm, careful here. I think that in DIF/DIX the checksums are
per-sector, not per IO, right? If so, the PAGE_SIZE attr limit in this
patch would implicitly impose different max IO sizes on different
architectures, because PAGE_SIZE varies between them. That doesn't
seem great.
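To put rough numbers on that, here is a small sketch. It assumes the T10 DIF layout of one 8-byte protection tuple per 512-byte sector and ignores any attribute header overhead; the constants and the helper name max_io_bytes() are illustrative and not taken from the patch.

```c
#include <stddef.h>

/* Assumed values, not taken from the patch: T10 DIF carries one 8-byte
 * protection tuple for each 512-byte sector. */
#define EX_SECTOR_SIZE    512u
#define EX_DIF_TUPLE_SIZE   8u

/* Largest IO, in bytes, whose per-sector tuples fit in attr_limit bytes.
 * Ignores any attribute header overhead. */
static size_t max_io_bytes(size_t attr_limit)
{
	return (attr_limit / EX_DIF_TUPLE_SIZE) * EX_SECTOR_SIZE;
}
```

Under those assumptions, max_io_bytes(4096) gives 256 KiB on a 4 KiB-page machine, while max_io_bytes(65536) gives 4 MiB with 64 KiB pages, so the same attribute limit yields a 16x difference in max IO size.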
> Hence, AIO attributes.
I have to be honest: I really don't like tying the interface to AIO, but
I guess it's the only per-io facility we have today. It'd be nice to
include sync O_DIRECT when designing the interface to make sure that it
is possible to use generic syscalls in the future without running up
against unexpected problems.
> An iocb_attr has an id field, and a size field - and some amount of data
> specific to that attribute.
I'd hope that we can come up with a less fragile interface. The kernel
would have to scan the attributes to make sure that there aren't
malicious sizes. I only glanced quickly at the loops, but it looked
like a 0 size attribute in there would make _next() spin forever,
since the walk would never advance past it.
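As a sketch of the kind of scan that would be needed, the walk below rejects any size that is smaller than the header or that runs past the end of the buffer. The struct layout, the field widths, and the name count_attrs() are hypothetical; the only things taken from the description are the id and size fields, and the assumption here is that size counts the header plus its payload.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout for illustration; the patch's real struct may
 * differ. size counts the header plus the payload that follows it. */
struct example_attr {
	uint32_t id;
	uint32_t size;
};

/* Walk a buffer of packed attributes. Returns the number of attributes,
 * or -1 if any header is truncated or any size field is malformed. */
static int count_attrs(const unsigned char *buf, size_t len)
{
	size_t off = 0;
	int n = 0;

	while (off < len) {
		struct example_attr hdr;

		if (len - off < sizeof(hdr))
			return -1;	/* truncated header */
		memcpy(&hdr, buf + off, sizeof(hdr));

		/* A size below the header size (including 0) would stall
		 * the walk; a size past the end would overrun the buffer. */
		if (hdr.size < sizeof(hdr) || hdr.size > len - off)
			return -1;

		off += hdr.size;
		n++;
	}
	return n;
}
```

Because every accepted attribute moves the offset forward by at least the header size, the loop always terminates, and a 0 size attribute is rejected instead of being revisited forever.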
- z