Message-ID: <20110908134945.GA7024@redhat.com>
Date: Thu, 8 Sep 2011 09:49:45 -0400
From: Vivek Goyal <vgoyal@...hat.com>
To: Takuya Yoshikawa <yoshikawa.takuya@....ntt.co.jp>
Cc: linux-kernel@...r.kernel.org, qemu-devel@...gnu.org,
kvm@...r.kernel.org, axboe@...nel.dk, takuya.yoshikawa@...il.com
Subject: Re: CFQ I/O starvation problem triggered by RHEL6.0 KVM guests
On Thu, Sep 08, 2011 at 06:13:53PM +0900, Takuya Yoshikawa wrote:
> This is a report of strange cfq behaviour which seems to be triggered by
> QEMU posix aio threads.
>
> Host environment:
> OS: RHEL6.0 KVM/qemu-kvm (with no patch applied)
> IO scheduler: cfq (with the default parameters)
So you are using RHEL 6.0 for both the host and the guest kernels? Can you
reproduce the same issue with upstream kernels? How easily/frequently
can you reproduce this with a RHEL6.0 host?
>
> On the host, we were running 3 linux guests to see if I/O from these guests
> would be handled fairly by host; each guest did dd write with oflag=direct.
>
> Guest virtual disk:
> We used a host local disk which had 3 partitions, and each guest was
> allocated one of these as dd write target.
>
> So our test was for checking if cfq could keep fairness for the 3 guests
> who shared the same disk.
>
> The result (strange starvation):
> Sometimes, one guest dominated cfq for more than 10sec and requests from
> other guests were not handled at all during that time.
>
> Below is the blktrace log which shows that a request to (8,27) in cfq2068S (*1)
> is not handled at all while cfq2095S and cfq2067S, which hold requests to
> (8,26), are being handled alternately.
>
> *1) WS 104920578 + 64
>
> Question:
> I guess that cfq_close_cooperator() was being called in an unusual manner.
> If so, do you think that cfq is responsible for keeping fairness for this
> kind of unusual write requests?
- If two guests are doing IO to separate partitions, their requests should
not really be very close (unless the partitions are really small).
- Even if there are close cooperators, these queues are merged and
treated as a single queue from the slice point of view. So cooperating
queues should be merged and get a single slice instead of starving
other queues in the system.
Can you upload the blktrace logs somewhere, so that we can see what
happened during those 10 seconds?
>
> Note:
> With RHEL6.1, this problem could not be triggered. But I guess that was due to
> QEMU's block layer updates.
You can try reproducing this with fio.
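Something along these lines might be a starting point. It is an untested
sketch, not a known reproducer: /dev/sdX1..3 are placeholder partitions,
bs=32k is only a guess based on the "+ 64" sector count in the request
above, and psync with a few threads per partition is an assumption meant
to roughly imitate QEMU's posix aio threads.

```ini
; Hypothetical fio job file: sequential O_DIRECT writers, one job section
; per partition of the same disk.
; WARNING: /dev/sdX1..3 are placeholders; writing to them destroys their data.
[global]
; synchronous pwrite() calls issued from worker threads (assumed approximation)
ioengine=psync
; bypass the page cache, like dd with oflag=direct
direct=1
rw=write
; 32 KiB = 64 sectors of 512 bytes (a guess, see above)
bs=32k
; run for a fixed 60 seconds, restarting the workload if it finishes early
runtime=60
time_based
; use threads instead of processes, 4 per job section
thread
numjobs=4

[guest1]
filename=/dev/sdX1

[guest2]
filename=/dev/sdX2

[guest3]
filename=/dev/sdX3
```

Running blktrace on the disk while this job runs should show whether one
writer again dominates for seconds at a time.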
Thanks
Vivek