Message-ID: <e98e18940907021309u1f784b3at409b55ba46ed108c@mail.gmail.com>
Date:	Thu, 2 Jul 2009 13:09:14 -0700
From:	Nauman Rafique <nauman@...gle.com>
To:	Vivek Goyal <vgoyal@...hat.com>
Cc:	linux-kernel@...r.kernel.org,
	containers@...ts.linux-foundation.org, dm-devel@...hat.com,
	jens.axboe@...cle.com, dpshah@...gle.com, lizf@...fujitsu.com,
	mikew@...gle.com, fchecconi@...il.com, paolo.valente@...more.it,
	ryov@...inux.co.jp, fernando@....ntt.co.jp, s-uchida@...jp.nec.com,
	taka@...inux.co.jp, guijianfeng@...fujitsu.com, jmoyer@...hat.com,
	dhaval@...ux.vnet.ibm.com, balbir@...ux.vnet.ibm.com,
	righi.andrea@...il.com, m-ikeda@...jp.nec.com, jbaron@...hat.com,
	agk@...hat.com, snitzer@...hat.com, akpm@...ux-foundation.org,
	peterz@...radead.org
Subject: Re: [PATCH 13/25] io-controller: Wait for requests to complete from 
	last queue before new queue is scheduled

On Thu, Jul 2, 2009 at 1:01 PM, Vivek Goyal<vgoyal@...hat.com> wrote:
> o Currently one can dispatch requests from multiple queues to the disk. This
>  is true for hardware which supports queuing. So if a disk supports a queue
>  depth of 31, it is possible that 20 requests are dispatched from queue 1
>  and then the next queue is scheduled in, which dispatches more requests.
>
> o This multiple queue dispatch introduces issues for accurate accounting of
>  disk time consumed by a particular queue. For example, if one async queue
>  is scheduled in, it can dispatch 31 requests to the disk and then it will
>  be expired and a new sync queue might get scheduled in. These 31 requests
>  might take a long time to finish, but this time is never accounted to the
>  async queue that dispatched them.
>
> o This patch introduces the functionality where we wait for all the requests
>  from the previous queue to finish before the next queue is scheduled in.
>  That way a queue is more accurately accounted for the disk time it has
>  consumed. Note this still does not take care of errors introduced by disk
>  write caching.
>
> o Because the above behavior can result in reduced throughput, it is
>  enabled only if the user sets the "fairness" tunable to 2 or higher.

Vivek,
Did you collect any numbers for the impact on throughput from this
patch? It seems like with this change, we can even support NCQ.

>
> o This patch helps in achieving more isolation between reads and buffered
>  writes in different cgroups. Buffered writes typically utilize the full
>  queue depth and then expire the queue. On the contrary, sequential reads
>  typically drive a queue depth of 1. So despite the fact that writes use
>  more disk time, it is never accounted to the write queue because we don't
>  wait for requests to finish after dispatching them. This patch helps do
>  more accurate accounting of disk time, especially for buffered writes,
>  providing better fairness and hence better isolation between two cgroups
>  running read and write workloads.
>
> Signed-off-by: Vivek Goyal <vgoyal@...hat.com>
> ---
>  block/elevator-fq.c |   31 ++++++++++++++++++++++++++++++-
>  1 files changed, 30 insertions(+), 1 deletions(-)
>
> diff --git a/block/elevator-fq.c b/block/elevator-fq.c
> index 68be1dc..7609579 100644
> --- a/block/elevator-fq.c
> +++ b/block/elevator-fq.c
> @@ -2038,7 +2038,7 @@ STORE_FUNCTION(elv_slice_sync_store, &efqd->elv_slice[1], 1, UINT_MAX, 1);
>  EXPORT_SYMBOL(elv_slice_sync_store);
>  STORE_FUNCTION(elv_slice_async_store, &efqd->elv_slice[0], 1, UINT_MAX, 1);
>  EXPORT_SYMBOL(elv_slice_async_store);
> -STORE_FUNCTION(elv_fairness_store, &efqd->fairness, 0, 1, 0);
> +STORE_FUNCTION(elv_fairness_store, &efqd->fairness, 0, 2, 0);
>  EXPORT_SYMBOL(elv_fairness_store);
>  #undef STORE_FUNCTION
>
> @@ -2952,6 +2952,24 @@ void *elv_fq_select_ioq(struct request_queue *q, int force)
>        }
>
>  expire:
> +       if (efqd->fairness >= 2 && !force && ioq && ioq->dispatched) {
> +               /*
> +                * If there are request dispatched from this queue, don't
> +                * dispatch requests from new queue till all the requests from
> +                * this queue have completed.
> +                *
> +                * This helps in attributing right amount of disk time consumed
> +                * by a particular queue when hardware allows queuing.
> +                *
> +                * Set ioq = NULL so that no more requests are dispatched from
> +                * this queue.
> +                */
> +               elv_log_ioq(efqd, ioq, "select: wait for requests to finish"
> +                               " disp=%lu", ioq->dispatched);
> +               ioq = NULL;
> +               goto keep_queue;
> +       }
> +
>        elv_ioq_slice_expired(q);
>  new_queue:
>        ioq = elv_set_active_ioq(q, new_ioq);
> @@ -3109,6 +3127,17 @@ void elv_ioq_completed_request(struct request_queue *q, struct request *rq)
>                                 */
>                                elv_ioq_arm_slice_timer(q, 1);
>                        } else {
> +                               /* If fairness >=2 and there are requests
> +                                * dispatched from this queue, don't dispatch
> +                                * new requests from a different queue till
> +                                * all requests from this queue have finished.
> +                                * This helps in attributing right disk time
> +                                * to a queue when hardware supports queuing.
> +                                */
> +
> +                               if (efqd->fairness >= 2 && ioq->dispatched)
> +                                       goto done;
> +
>                                /* Expire the queue */
>                                elv_ioq_slice_expired(q);
>                        }
> --
> 1.6.0.6
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
