Message-ID: <20110610091427.GB4183@redhat.com>
Date: Fri, 10 Jun 2011 05:14:27 -0400
From: Vivek Goyal <vgoyal@...hat.com>
To: Tao Ma <tm@....ma>
Cc: linux-kernel@...r.kernel.org, Jens Axboe <axboe@...nel.dk>
Subject: Re: CFQ: async queue blocks the whole system
On Fri, Jun 10, 2011 at 01:48:37PM +0800, Tao Ma wrote:
[..]
> >> btw, reverting the patch doesn't work. I can still get the livelock.
What test exactly are you running? I am primarily interested in whether
you still get the hung task timeout warning, where a writer waits in
get_request_wait() for more than 120 seconds.
The livelock might be a different problem, for which Christoph provided
a patch for XFS.
> >
> > Can you give following patch a try and see if it helps. On my system this
> > does allow CFQ to dispatch some writes once in a while.
> Sorry, this patch doesn't work in my test.
Can you give me a backtrace of, say, 15 seconds each with and without
the patch?
I think we must now be dispatching some writes. It is a different matter
that the writer still sleeps for more than 120 seconds because there are
far too many readers.
Maybe we need to look into how workload tree scheduling takes place and
tweak that logic a bit.
Looking at backtraces should help.
On my system with an XFS filesystem I ran 32 readers and 16 buffered
writers with fio for 180 seconds. Without the patch I was getting the hung
task timeout warning, and with the patch I stopped getting it. I also ran
blktrace and saw that we got to dispatch a write roughly every 4 seconds,
which is much better than complete write starvation.
So basically blktrace will help here.
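As a rough sketch of how the interval between write dispatches could be
checked from a trace: the Python below assumes blkparse's default text
output (device, cpu, sequence, timestamp, pid, action, rwbs, ...), and
the function name is hypothetical, not part of blktrace.

```python
def write_dispatch_gaps(lines):
    """Return the gaps (in seconds) between successive write dispatches.

    Assumes blkparse's default per-event layout:
        dev cpu seq timestamp pid action rwbs sector + nblocks [process]
    Lines that do not match (e.g. the summary section) are skipped.
    """
    times = []
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue
        action, rwbs = fields[5], fields[6]
        # "D" is the action for a request issued to the driver;
        # a "W" in the rwbs field marks a write.
        if action != "D" or "W" not in rwbs:
            continue
        try:
            times.append(float(fields[3]))
        except ValueError:
            continue
    times.sort()
    return [later - earlier for earlier, later in zip(times, times[1:])]
```

Running this over the blkparse output of a trace taken with and without
the patch and comparing max() of the returned list should show whether
writes are being dispatched at all and how far apart they are.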
Thanks
Vivek