Message-ID: <20121003012825.GX23520@dastard>
Date: Wed, 3 Oct 2012 11:28:25 +1000
From: Dave Chinner <david@...morbit.com>
To: Kent Overstreet <koverstreet@...gle.com>
Cc: Jeff Moyer <jmoyer@...hat.com>, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, tytso@...gle.com, tj@...nel.org,
Dave Kleikamp <dave.kleikamp@...cle.com>,
Zach Brown <zab@...bo.net>,
Dmitry Monakhov <dmonakhov@...nvz.org>,
"Maxim V. Patlasov" <mpatlasov@...allels.com>,
michael.mesnier@...el.com, jeffrey.d.skirvin@...el.com,
pjt@...gle.com
Subject: Re: [RFC, PATCH] Extensible AIO interface
On Tue, Oct 02, 2012 at 05:20:29PM -0700, Kent Overstreet wrote:
> On Tue, Oct 02, 2012 at 01:41:17PM -0400, Jeff Moyer wrote:
> > Kent Overstreet <koverstreet@...gle.com> writes:
> >
> > > So, I and other people keep running into things where we really need to
> > > add an interface to pass some auxiliary... stuff along with a pread() or
> > > pwrite().
> > >
> > > A few examples:
> > >
> > > * IO scheduler hints. Some userspace program wants to, per IO, specify
> > > either priorities or a cgroup - by specifying a cgroup you can have a
> > > fileserver in userspace that makes use of cfq's per cgroup bandwidth
> > > quotas.
> >
> > You can do this today by splitting I/O between processes and placing
> > those processes in different cgroups. For io priority, there is
> > ioprio_set, which incurs an extra system call, but can be used. Not
> > elegant, but possible.
>
> Yes - those are things I'm trying to replace. Doing it that way is a
> real pain, both as it's a lousy interface for this and it does impact
> performance (ioprio_set doesn't really work too well with aio, too).
>
> > > * Cache hints. For bcache and other things, userspace may want to specify
> > > "this data should be cached", "this data should bypass the cache", etc.
> >
> > Please explain how you will differentiate this from posix_fadvise.
>
> Oh sorry, I think about SSD caching so much I forget to say that's what
> I'm talking about. posix_fadvise is for the page cache, we want
> something different for an SSD cache (IMO it'd be really ugly to use it
> for both, and posix_fadvise() can't really specify everything we'd
> want to for an SSD cache).

A similar discussion is under way about using posix_fadvise() to mark
ranges of files as volatile (i.e. to tell the kernel what can be
evicted from a cache when space reclaim is required):

https://lkml.org/lkml/2012/10/2/501

If you have requirements for specific cache management, then it
might be worth seeing if you can steer an existing interface
proposal for some form of cache management in the direction you
need.
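
For reference, this is roughly what the existing per-range page cache
hint looks like from userspace. It is only a minimal sketch: the helper
names advise_page_cache() and demo() are made up for illustration, and
it uses Python's os.posix_fadvise() wrapper (Linux/Unix only) rather
than anything from the proposals discussed in this thread. Note that
the advice applies to the page cache only, which is the limitation
raised above for SSD caches.

```python
import os
import tempfile


def advise_page_cache(path, offset, length, drop):
    """Hint the kernel about page cache use for a byte range of `path`.

    drop=True  -> POSIX_FADV_DONTNEED (the range may be evicted)
    drop=False -> POSIX_FADV_WILLNEED (the range will be needed soon)
    Returns the advice constant that was passed to the kernel.
    """
    advice = os.POSIX_FADV_DONTNEED if drop else os.POSIX_FADV_WILLNEED
    fd = os.open(path, os.O_RDONLY)
    try:
        # The hint is advisory; the kernel is free to ignore it.
        os.posix_fadvise(fd, offset, length, advice)
    finally:
        os.close(fd)
    return advice


def demo():
    """Create a scratch file and issue one hint per 4 KiB half."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * 8192)
        path = f.name
    try:
        first = advise_page_cache(path, 0, 4096, drop=False)
        second = advise_page_cache(path, 4096, 4096, drop=True)
        return first, second
    finally:
        os.unlink(path)
```

The hint is issued as a separate call on the file descriptor, so it is
not tied to any individual read or write.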
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com