lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 29 Jun 2010 11:06:19 +0200
From:	Corrado Zoccolo <czoccolo@...il.com>
To:	Jeff Moyer <jmoyer@...hat.com>
Cc:	Vivek Goyal <vgoyal@...hat.com>, Christoph Hellwig <hch@....de>,
	Jens Axboe <axboe@...nel.dk>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: trying to understand READ_META, READ_SYNC, WRITE_SYNC & co

On Sun, Jun 27, 2010 at 5:44 PM, Jeff Moyer <jmoyer@...hat.com> wrote:
> Vivek Goyal <vgoyal@...hat.com> writes:
>
>> On Fri, Jun 25, 2010 at 01:03:20PM +0200, Christoph Hellwig wrote:
>>> On Wed, Jun 23, 2010 at 09:44:20PM -0400, Vivek Goyal wrote:
>>> I see the point of this logic for reads where various workloads have
>>> dependent reads that might be close to each other, but I don't really
>>> see any point for writes.
>>>
>>> > So looks like fsync path will do bunch of IO and then will wait for jbd thread
>>> > to finish the work. In this case idling is waste of time.
>>>
>>> Given that ->writepage already does WRITE_SYNC_PLUG I/O which includes
>>> REQ_NODILE I'm still confused why we still have that issue.
>>
>> In current form, cfq honors REQ_NOIDLE conditionally and that's why we
>> still have the issue. If you look at cfq_completed_request(), we continue
>> to idle in following two cases.
>>
>> - If we classifed the queue as SYNC_WORKLOAD.
>> - If there is another random read/write happening on sync-noidle service
>>   tree.
>>
>> SYNC_WORKLOAD means that cfq thinks this particular queue is doing sequential
>> IO. For random IO queues, we don't idle on each individual queue but a
>> group of queue.
>>
>> In jeff's testing, fsync thread/queue sometimes is viewed as sequential
>> workload and goes on SYNC_WORKLOAD tree. In that case even if request is
>> REQ_NOIDLE, we will continue to idle hence fsync issue.
>
> I'm now testing OCFS2, and I'm seeing performance that is not great
> (even with the blk_yield patches applied).  What happens is that we
> successfully yield the queue to the journal thread, but then idle on the
> journal thread (even though RQ_NOIDLE was set).
>
> So, can we just get rid of idling when RQ_NOIDLE is set?
Hi Jeff,
I think I spotted a problem with the initial implementation of the
tree-wide idle when RQ_NOIDLE is set: I assumed that a queue would
either send possibly-idling requests or no-idle requests, but it seems
that RQ_NOIDLE is being used to mark the end of a stream of
possibly-idling requests (in my initial implementation, this will then
cause an unintended idle). The attached patch should fix it, and I
think the logic is simpler than Vivek's. Can you give it a spin?
Otherwise, I think that reverting the "noidle_tree_requires_idle"
behaviour completely may be better than adding complexity, since it is
really trying to solve corner cases (that maybe happen only on
synthetic workloads), but affecting negatively more common cases.

About what it is trying to solve, since I think it was not clear:
- we have a workload of 2 queues, both issuing requests that are being
put in the no-idle tree (e.g. they are random) + 1 queue issuing
idling requests (e.g. sequential).
- if one of the 2 "random" queues marks its requests as RQ_NOIDLE,
then the timeslice for the no-idle tree is not preserved, causing
unfairness, as soon as an RQ_NOIDLE request is serviced and the tree
is empty.

Thanks
Corrado

>
> Vivek sent me this patch to test, and it got rid of the performance
> issue for the fsync workload.  Can we discuss its merits?
>
> Thanks,
> Jeff
>
> Index: linux-2.6/block/cfq-iosched.c
> ===================================================================
> --- linux-2.6.orig/block/cfq-iosched.c  2010-06-25 15:57:33.832125786 -0400
> +++ linux-2.6/block/cfq-iosched.c       2010-06-25 15:59:19.788876361 -0400
> @@ -318,6 +318,7 @@
>        CFQ_CFQQ_FLAG_split_coop,       /* shared cfqq will be splitted */
>        CFQ_CFQQ_FLAG_deep,             /* sync cfqq experienced large depth */
>        CFQ_CFQQ_FLAG_wait_busy,        /* Waiting for next request */
> +       CFQ_CFQQ_FLAG_group_idle,       /* This queue is doing group idle */
>  };
>
>  #define CFQ_CFQQ_FNS(name)                                             \
> @@ -347,6 +348,7 @@
>  CFQ_CFQQ_FNS(split_coop);
>  CFQ_CFQQ_FNS(deep);
>  CFQ_CFQQ_FNS(wait_busy);
> +CFQ_CFQQ_FNS(group_idle);
>  #undef CFQ_CFQQ_FNS
>
>  #ifdef CONFIG_CFQ_GROUP_IOSCHED
> @@ -1613,6 +1615,7 @@
>
>        cfq_clear_cfqq_wait_request(cfqq);
>        cfq_clear_cfqq_wait_busy(cfqq);
> +       cfq_clear_cfqq_group_idle(cfqq);
>
>        /*
>         * If this cfqq is shared between multiple processes, check to
> @@ -3176,6 +3179,13 @@
>        if (cfq_class_rt(new_cfqq) && !cfq_class_rt(cfqq))
>                return true;
>
> +       /*
> +        * If were doing group_idle and we got new request in same group,
> +        * preempt the queue
> +        */
> +       if (cfq_cfqq_group_idle(cfqq))
> +               return true;
> +
>        if (!cfqd->active_cic || !cfq_cfqq_wait_request(cfqq))
>                return false;
>
> @@ -3271,6 +3281,7 @@
>        struct cfq_queue *cfqq = RQ_CFQQ(rq);
>
>        cfq_log_cfqq(cfqd, cfqq, "insert_request");
> +       cfq_clear_cfqq_group_idle(cfqq);
>        cfq_init_prio_data(cfqq, RQ_CIC(rq)->ioc);
>
>        rq_set_fifo_time(rq, jiffies + cfqd->cfq_fifo_expire[rq_is_sync(rq)]);
> @@ -3416,10 +3427,12 @@
>                         * SYNC_NOIDLE_WORKLOAD idles at the end of the tree
>                         * only if we processed at least one !rq_noidle request
>                         */
> -                       if (cfqd->serving_type == SYNC_WORKLOAD
> -                           || cfqd->noidle_tree_requires_idle
> -                           || cfqq->cfqg->nr_cfqq == 1)
> +                       if (cfqd->noidle_tree_requires_idle)
> +                               cfq_arm_slice_timer(cfqd);
> +                       else if (cfqq->cfqg->nr_cfqq == 1) {
> +                               cfq_mark_cfqq_group_idle(cfqq);
>                                cfq_arm_slice_timer(cfqd);
> +                       }
>                }
>        }
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>



-- 
__________________________________________________________________________

dott. Corrado Zoccolo                          mailto:czoccolo@...il.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
                               Tales of Power - C. Castaneda



Hi

View attachment "0001-cfq-iosched-fix-tree-wide-handling-of-rq_noidle.patch" of type "text/x-patch" (2252 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ