In some drives, flush requests are non-queueable. While a flush request
is running, normal read/write requests can't run. If the block layer
dispatches such a request, the driver can't handle it and requeues it.

Tejun suggested we can hold the queue while a flush is running. This
avoids the unnecessary requeue.

It can also improve performance. Say we have requests f1, w1, f2 (f is a
flush request and w is a write request). While f1 is running, the queue
is held, so w1 is not added to the queue list. Just after f1 finishes,
f2 is dispatched. Since f1 has already flushed the cache out, f2 can
finish very quickly.

In my test, the queue holding completely solves a regression introduced
by commit 53d63e6b0dfb9588, which is about a 20% regression running a
sysbench fileio workload.

Signed-off-by: Shaohua Li
---
 block/blk-flush.c      |    3 +++
 block/blk.h            |   12 +++++++++++-
 include/linux/blkdev.h |    1 +
 3 files changed, 15 insertions(+), 1 deletion(-)

Index: linux/block/blk-flush.c
===================================================================
--- linux.orig/block/blk-flush.c	2011-05-04 14:20:33.000000000 +0800
+++ linux/block/blk-flush.c	2011-05-04 15:23:50.000000000 +0800
@@ -199,6 +199,9 @@ static void flush_end_io(struct request
 
 	BUG_ON(q->flush_pending_idx == q->flush_running_idx);
 
+	queued |= q->flush_queue_delayed;
+	q->flush_queue_delayed = 0;
+
 	/* account completion of the flush request */
 	q->flush_running_idx ^= 1;
 	elv_completed_request(q, flush_rq);
Index: linux/include/linux/blkdev.h
===================================================================
--- linux.orig/include/linux/blkdev.h	2011-05-04 14:24:40.000000000 +0800
+++ linux/include/linux/blkdev.h	2011-05-04 14:29:29.000000000 +0800
@@ -365,6 +365,7 @@ struct request_queue
 	 */
 	unsigned int		flush_flags;
 	unsigned int		flush_not_queueable:1;
+	unsigned int		flush_queue_delayed:1;
 	unsigned int		flush_pending_idx:1;
 	unsigned int		flush_running_idx:1;
 	unsigned long		flush_pending_since;
Index: linux/block/blk.h
===================================================================
--- linux.orig/block/blk.h	2011-05-04 14:20:33.000000000 +0800
+++ linux/block/blk.h	2011-05-04 16:09:42.000000000 +0800
@@ -61,7 +61,17 @@ static inline struct request *__elv_next
 			rq = list_entry_rq(q->queue_head.next);
 			return rq;
 		}
-
+		/*
+		 * A flush is running and the drive can't queue flush requests,
+		 * so hold the queue until the flush finishes. Even if we don't
+		 * do this, the driver can't dispatch the next requests and
+		 * will requeue them.
+		 */
+		if (q->flush_pending_idx != q->flush_running_idx &&
+				!blk_queue_flush_queueable(q)) {
+			q->flush_queue_delayed = 1;
+			return NULL;
+		}
 		if (!q->elevator->ops->elevator_dispatch_fn(q, 0))
 			return NULL;
 	}
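
For anyone who wants to follow the f1/w1 sequence from the changelog
without booting a kernel, here is a rough userspace sketch of the hold
logic. It is only an illustration, not kernel code: struct toy_queue, the
pending counter and all toy_*() helpers are made up for this example. Only
the four flush_* field names are taken from the patch, and the sketch
assumes that issuing a flush toggles flush_pending_idx while completing it
toggles flush_running_idx, so the two differ while a flush is in flight.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the flush bits of struct request_queue. */
struct toy_queue {
	unsigned int flush_not_queueable:1;
	unsigned int flush_queue_delayed:1;
	unsigned int flush_pending_idx:1;
	unsigned int flush_running_idx:1;
	int pending;	/* made up: requests still waiting to be dispatched */
};

/* Assumed model of issuing a flush: the pending index flips. */
static void toy_flush_start(struct toy_queue *q)
{
	q->flush_pending_idx ^= 1;
}

/*
 * Modelled on the check added to __elv_next_request(). Returns true if
 * a request was handed to the driver, false if nothing was dispatched.
 */
static bool toy_dispatch(struct toy_queue *q)
{
	if (q->flush_pending_idx != q->flush_running_idx &&
	    q->flush_not_queueable) {
		/* Hold the queue and remember that it must be re-run. */
		q->flush_queue_delayed = 1;
		return false;
	}
	if (q->pending == 0)
		return false;
	q->pending--;
	return true;
}

/*
 * Modelled on the flush_end_io() hunk. Returns true if the caller
 * should run the queue again because dispatch was held meanwhile.
 */
static bool toy_flush_end(struct toy_queue *q)
{
	bool queued = false;

	queued |= q->flush_queue_delayed;
	q->flush_queue_delayed = 0;
	q->flush_running_idx ^= 1;
	return queued;
}

/* Walks through f1 followed by w1 on a drive that can't queue flushes. */
static void toy_demo(void)
{
	struct toy_queue q = { .flush_not_queueable = 1, .pending = 1 };

	toy_flush_start(&q);		/* f1 goes to the drive */
	assert(!toy_dispatch(&q));	/* w1 is held back, not requeued */
	assert(toy_flush_end(&q));	/* f1 done: queue needs a re-run */
	assert(toy_dispatch(&q));	/* w1 is dispatched now */
}
```

Calling toy_demo() from a main() is enough to run the scenario. With
flush_not_queueable cleared, toy_dispatch() lets w1 through while f1 is
still in flight, which matches the behaviour without the hold.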