Message-ID: <20110614133047.GA2525@redhat.com>
Date: Tue, 14 Jun 2011 09:30:47 -0400
From: Vivek Goyal <vgoyal@...hat.com>
To: Tao Ma <tm@....ma>
Cc: linux-kernel@...r.kernel.org, Jens Axboe <axboe@...nel.dk>
Subject: Re: CFQ: async queue blocks the whole system
On Tue, Jun 14, 2011 at 03:03:24PM +0800, Tao Ma wrote:
> Hi Vivek,
> On 06/14/2011 05:41 AM, Vivek Goyal wrote:
> > On Mon, Jun 13, 2011 at 06:08:40PM +0800, Tao Ma wrote:
> >
> > [..]
> >>> You can also run iostat on disk and should be able to see that with
> >>> the patch you are dispatching writes more often than before.
> >> Sorry, the patch doesn't work.
> >>
> >> I used trace event to capture all the blktraces since it doesn't
> >> interfere with the tests, hope it helps.
> >
> > Actually I was looking for CFQ traces. This seems to be generic block
> > layer trace points. May be you can use "blktrace -d /dev/<device>"
> > and then blkparse. It also gives the aggregate view which is helpful.
> >
> >>
> >> Please download it from http://blog.coly.li/tmp/blktrace.tar.bz2
> >
> > What concerns me is following.
> >
> > 5255.521353: block_rq_issue: 8,0 W 0 () 571137153 + 8 [attr_set]
> > 5578.863871: block_rq_issue: 8,0 W 0 () 512950473 + 48 [kworker/0:1]
> >
> > IIUC, we dispatched the second write more than 300 seconds after dispatching
> > the first write. What happened in between? We should have dispatched more writes.
> >
> > CFQ traces might give better idea in terms of whether wl_type for async
> > queues was scheduled or not at all.
> I tried several times today, but it looks like that if I enable
> blktrace, the hung_task will not show up in the message. So do you think
> the blktrace at that time is still useful? If yes, I can capture 1
> minute for you. Thanks.
Capturing 1 minute of output will also be good.
You can do one more thing: mount the block IO controller. It has stats for
sync and async dispatch (blkio.io_serviced or blkio.io_service_bytes). You
can write a simple script to read and print these files every few seconds.
That will also tell us whether CFQ is dispatching async requests for the
said device regularly or not.
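A minimal sketch of such a script, in Python. It rests on a few assumptions that are not from this thread: the blkio controller is mounted at /sys/fs/cgroup/blkio (adjust BLKIO_ROOT to the actual mount point), the stat files hold per-device lines of the form "major:minor Operation value" followed by a grand-total line, and the helper names parse_blkio_stat and poll are made up for illustration.

```python
import time

# Assumption: mount point of the blkio controller; change it to match
# wherever the controller is mounted on the test machine.
BLKIO_ROOT = "/sys/fs/cgroup/blkio"
STAT_FILES = ("blkio.io_serviced", "blkio.io_service_bytes")


def parse_blkio_stat(text, device):
    """Return {operation: value} for one device, e.g. {"Sync": 12, "Async": 3}.

    Per-device lines are assumed to look like "8:0 Async 42". The trailing
    grand-total line ("Total 1234") has only two fields and is skipped.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == device:
            stats[fields[1]] = int(fields[2])
    return stats


def poll(device, interval=5.0):
    """Print Sync/Async counters and their change since the previous sample."""
    previous = {}
    while True:
        stamp = time.strftime("%H:%M:%S")
        for name in STAT_FILES:
            with open(BLKIO_ROOT + "/" + name) as f:
                current = parse_blkio_stat(f.read(), device)
            for op in ("Sync", "Async"):
                value = current.get(op, 0)
                # The first sample has no predecessor, so its delta is 0.
                delta = value - previous.get((name, op), value)
                print("%s %s %s: %d (+%d)" % (stamp, name, op, value, delta))
                previous[(name, op)] = value
        time.sleep(interval)
```

Calling poll("8:0") would watch the device that shows up as 8,0 in the trace lines quoted above. If the Async delta stays at +0 across many samples while writes are pending, that would suggest async requests are not being dispatched regularly.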
So both the blktrace output and the blkio controller stats will help.
Thanks
Vivek