Message-ID: <20130107233443.GG26407@google.com>
Date: Mon, 7 Jan 2013 15:34:43 -0800
From: Kent Overstreet <koverstreet@...gle.com>
To: Jens Axboe <axboe@...nel.dk>
Cc: linux-kernel@...r.kernel.org, linux-aio@...ck.org,
linux-fsdevel@...r.kernel.org, zab@...hat.com, bcrl@...ck.org,
jmoyer@...hat.com, viro@...iv.linux.org.uk, tytso@....edu
Subject: Re: [PATCH 29/32] block, aio: Batch completion for bios/kiocbs
On Fri, Jan 04, 2013 at 10:22:35AM +0100, Jens Axboe wrote:
> On 2012-12-27 03:00, Kent Overstreet wrote:
> > When completing a kiocb, there's some fixed overhead from touching the
> > kioctx's ring buffer the kiocb belongs to. Some newer high end block
> > devices can complete multiple IOs per interrupt, much like many network
> > interfaces have been for some time.
> >
> > This plumbs through infrastructure so we can take advantage of multiple
> > completions at the interrupt level, and complete multiple kiocbs at the
> > same time.
> >
> > Drivers have to be converted to take advantage of this, but it's a
> > simple change and the next patches will convert a few drivers.
> >
> > To use it, an interrupt handler (or any code that completes bios or
> > requests) declares and initializes a struct batch_complete:
> >
> > struct batch_complete batch;
> > batch_complete_init(&batch);
> >
> > Then, instead of calling bio_endio(), it calls
> > bio_endio_batch(bio, err, &batch). This just adds the bio to a list in
> > the batch_complete.
> >
> > At the end, it calls
> >
> > batch_complete(&batch);
> >
> > This completes all the bios at once, building up a list of kiocbs;
> > then the list of kiocbs is completed all at once.
> >
> > Also, in order to batch up the kiocbs we have to add a different
> > bio_endio function to struct bio, that takes a pointer to the
> > batch_complete - this patch converts the dio code's bio_endio function.
> > In order to avoid changing every bio_endio function in the kernel (there
> > are many), we currently use a union and a flag to indicate what kind of
> > bio endio function to call. This is admittedly a hack, but should
> > suffice for now.
>
> It is indeed a hack... Famous last words as well, I'm sure that'll stick
> around forever if it goes in! Any ideas on how we can clean this up
> before that?

Well, I wouldn't _really_ mind changing all 200 bi_end_io uses. On the
other hand, the majority of them are either leaf nodes (filesystem code
and whatnot that's not completing anything else that could be batched),
or stuff like the dm and md code where it could be plumbed through (so
we could batch completions through md/dm) but it may take some thought
to do it right.

So I think I'd prefer to do it incrementally, for the moment. I'm always
a bit terrified of doing a cleanup that touches 50+ files, and then
changing my mind about something and going back and redoing it.

That said, I haven't forgotten about all the other block layer patches
I've got for you; as soon as I'm less swamped I'm going to finish off
that stuff, so I should be around to revisit it...

> Apart from that, I think the batching makes functional sense. For the
> devices where we do get batches of completions (most of them), it's the
> right thing to do. Would be nice if it were better integrated though, not a
> side hack.
>
> Is the rbtree really faster than a basic (l)list and a sort before
> completing them? Would be simpler.

Well, depends. With one or two kioctxs? The list would definitely be
faster, but I'm loath to use an O(n^2) algorithm anywhere the input
size isn't strictly controlled, and I know of applications out there
that use tons of kioctxs.

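To make the grouping problem concrete, here is a minimal userspace sketch of one possible reading of the list-plus-sort alternative: sort the batch by owning context, then walk it once so that each context's ring is touched once per batch. Everything in it is a hypothetical stand-in (struct fake_kiocb, ctx_id, complete_sorted()); it is not the rbtree code from the patch and not the real kiocb/kioctx structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a completed kiocb; ctx_id replaces the kioctx pointer. */
struct fake_kiocb {
	int ctx_id;
	long res;
};

/* qsort comparator: order completions by owning context. */
static int cmp_ctx(const void *a, const void *b)
{
	const struct fake_kiocb *x = a, *y = b;

	return (x->ctx_id > y->ctx_id) - (x->ctx_id < y->ctx_id);
}

/*
 * Sort the batch by context, then walk it once. Every change of context
 * stands in for one ring-buffer touch. Returns the number of touches.
 */
static size_t complete_sorted(struct fake_kiocb *batch, size_t n)
{
	size_t touches = 0;

	assert(batch != NULL || n == 0);
	qsort(batch, n, sizeof(*batch), cmp_ctx);

	for (size_t i = 0; i < n; i++)
		if (i == 0 || batch[i].ctx_id != batch[i - 1].ctx_id)
			touches++;

	return touches;
}

/* Unbatched baseline: every kiocb touches its ring on its own. */
static size_t complete_unbatched(const struct fake_kiocb *batch, size_t n)
{
	(void)batch;
	return n;
}
```

With six completions spread over three contexts, the sorted walk touches three rings where the unbatched path touches six. Whether a sort or a tree wins in practice depends on how many contexts show up per batch, which is the open question above.
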
> A few small comments below.
>
> > +void bio_endio_batch(struct bio *bio, int error, struct batch_complete *batch)
> > +{
> > +	if (error)
> > +		bio->bi_error = error;
> > +
> > +	if (batch)
> > +		bio_list_add(&batch->bio, bio);
> > +	else
> > +		__bio_endio(bio, batch);
> > +
> > +}
>
> Ugh, get rid of this 'batch' checking.

The reason I did it that way is - well, look at the dio code's bi_end_io
function. It has to be passed a struct batch_complete * in order to
batch kiocbs, but the driver that calls it may or may not have batch
completions plumbed through.

So unless every single driver gets converted (and I think that'd be
silly for all the ones that can't do any actual batching), something is
going to have to do that check, and it's better for it to live in generic
code than in every piece of mid-layer code we plumb it through.

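The argument for keeping the check in generic code can be sketched in userspace roughly as follows. This is a simplified model under stated assumptions, not the kernel code: struct bio is cut down to three fields, the batch is a plain singly linked list instead of a bio_list, and finish_bio() is a hypothetical placeholder for whatever the real completion does.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (assumptions, not the real layout). */
struct bio {
	struct bio *next;
	int error;
	int done;
};

struct batch_complete {
	struct bio *head, *tail;
};

static void batch_complete_init(struct batch_complete *batch)
{
	batch->head = batch->tail = NULL;
}

/* Hypothetical placeholder for the real per-bio completion work. */
static void finish_bio(struct bio *bio)
{
	bio->done = 1;
}

/*
 * Generic entry point: callers with a batch defer the completion,
 * callers without one (batch == NULL) complete immediately.
 */
static void bio_endio_batch(struct bio *bio, int error,
			    struct batch_complete *batch)
{
	assert(bio != NULL);

	if (error)
		bio->error = error;

	if (!batch) {
		finish_bio(bio);
		return;
	}

	bio->next = NULL;
	if (batch->tail)
		batch->tail->next = bio;
	else
		batch->head = bio;
	batch->tail = bio;
}

/* Complete everything queued so far; returns how many bios were finished. */
static int batch_complete(struct batch_complete *batch)
{
	int n = 0;

	for (struct bio *bio = batch->head, *next; bio; bio = next) {
		next = bio->next;
		finish_bio(bio);
		n++;
	}
	batch_complete_init(batch);
	return n;
}
```

A converted driver passes &batch and calls batch_complete() at the end of its interrupt handler; an unconverted one passes NULL and sees the old behaviour, so neither caller needs its own test.
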
>
> > +static inline void bio_endio(struct bio *bio, int error)
> > +{
> > +	bio_endio_batch(bio, error, NULL);
> > +}
> > +
>
> Just make that __bio_endio().

That one could be changed... I dislike having the if (error)
bio->bi_error = error duplicated, though.

Actually, it'd probably make more sense to inline bio_endio_batch(),
because the compiler will often either know whether batch is NULL or be
able to lift the check out of a loop.
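
A rough userspace illustration of the inlining idea, with cut-down hypothetical types rather than the real struct bio: when bio_endio_batch() is static inline, the bio_endio() wrapper passes a constant NULL, so the compiler is free to drop the batch branch there, and in a loop such as the hypothetical end_all() below the batch pointer is loop-invariant. Whether a given compiler actually performs either optimization is not something this sketch demonstrates.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins; the field names are assumptions for illustration only. */
struct bio {
	int error;
	int done;
	struct bio *next;
};

struct batch_complete {
	struct bio *head;
};

/* Inline so that callers with a known batch value can have the branch folded. */
static inline void bio_endio_batch(struct bio *bio, int error,
				   struct batch_complete *batch)
{
	assert(bio != NULL);

	if (error)
		bio->error = error;	/* the error store lives in one place only */

	if (batch) {
		bio->next = batch->head;
		batch->head = bio;
	} else {
		bio->done = 1;
	}
}

/* batch is the constant NULL here, so only the immediate path is needed. */
static inline void bio_endio(struct bio *bio, int error)
{
	bio_endio_batch(bio, error, NULL);
}

/* Hypothetical caller: batch does not change inside the loop. */
static void end_all(struct bio *bios, size_t n, struct batch_complete *batch)
{
	for (size_t i = 0; i < n; i++)
		bio_endio_batch(&bios[i], 0, batch);
}
```
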