[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKA=qzYea71A5VQOQgrqzaZtVFadjhcghgH+Z9n4ivS7vUgW0g@mail.gmail.com>
Date: Fri, 22 Jun 2012 16:34:07 -0500
From: Josh Hunt <joshhunt00@...il.com>
To: Vivek Goyal <vgoyal@...hat.com>
Cc: Jens Axboe <axboe@...nel.dk>, linux-kernel@...r.kernel.org,
tj@...nel.org
Subject: Re: multi-second application stall in open()
On Fri, Jun 22, 2012 at 3:57 PM, Josh Hunt <joshhunt00@...il.com> wrote:
> On Fri, Jun 22, 2012 at 3:53 PM, Josh Hunt <joshhunt00@...il.com> wrote:
>> On Fri, Jun 22, 2012 at 3:42 PM, Vivek Goyal <vgoyal@...hat.com> wrote:
>>> On Fri, Jun 22, 2012 at 03:22:01PM -0500, Josh Hunt wrote:
>>>
>>> [..]
>>>> It looks like my statement about busy_queues isn't entirely accurate.
>>>> I'm seeing this later on in the trace:
>>>
>>> Yes it is being called some times. But it is interesting to see that in
>>> the instance you pasted in last mail, we did not see "schedule_dispatch"
>>> message. Are you sure that other requests were queued at that point of
>>> time and there were busy_queues? You can try printing cfqd->rq_queued
>>> to verify that.
>>>
>>> Also put some message in cfq_select_queue() to see why it did not select
>>> a new queue before the stall.
>>>
>>> Thanks
>>> Vivek
>>
>> I think you may be correct that at those points in time there weren't
>> any busy queues. I do know that there were requests waiting during
>> this time:
>
> Sorry. Please disregard the prev mail. Looks like I must have scrolled
> my buffer while pasting in the traces... Here's what I meant to send:
>
> Insert:
> 8,0 1 296677 4847.796879805 6076 P N [flush-8:0]
> 8,0 1 296678 4847.796881202 6076 I W 32800441 + 8 (
> 2794) [flush-8:0]
> 8,0 1 0 4847.796883157 0 m N cfq5917 insert_request
> 8,0 1 296679 4847.796884834 6076 U N [flush-8:0] 1
> 8,0 1 296680 4847.796901037 6076 A R 32499249 + 8 <- (8,3) 11534424
> 8,0 1 296681 4847.796902154 6076 Q R 32499249 + 8 [flush-8:0]
> 8,0 1 296682 4847.796904389 6076 G R 32499249 + 8 [flush-8:0]
> 8,0 1 296683 4847.796906066 6076 I R 32499249 + 8 (
> 1677) [flush-8:0]
> 8,0 1 0 4847.796907462 0 m N cfq6076 insert_request
> 8,0 1 0 4847.796908859 0 m N cfq6076 add_to_rr
>
>
> Stall:
> 8,0 1 0 4847.816870313 0 m N cfq7646 insert_request
> 8,0 1 0 4847.816873665 0 m N cfq7646 del timer
> 8,0 1 0 4847.816874783 0 m N cfq7646 update idle time stats
> 8,0 1 0 4847.816876738 0 m N cfq7646 select_queue:
> active requests/not expired - keep_queue
> 8,0 1 0 4847.816878414 0 m N cfq7646 Idling on
> sync. st->count:1
> 8,0 1 0 4847.816879811 0 m N cfq7646 dispatch_insert
> 8,0 1 0 4847.816882046 0 m N cfq7646 remove request
> 8,0 1 0 4847.816884002 0 m N cfq7646 dispatched a request
> 8,0 1 0 4847.816885399 0 m N cfq7646 activate rq, drv=1
> 8,0 1 296748 4847.816887075 7646 D R 32499241 + 8 (
> 20115) [httpdmon]
> 8,0 1 0 4847.816945463 0 m N cfq7646 Idling on
> sync. st->count:1
> 8,0 1 0 4847.816946580 0 m N cfq7646 select_queue:
> dispatched and should_idle - keep_queue
> 8,0 1 0 4847.816948536 0 m N cfq7646 Idling on
> sync. st->count:1
> 8,0 1 0 4847.816949653 0 m N cfq7646 select_queue:
> dispatched and should_idle - keep_queue
> 8,0 0 440261 4847.817104144 7284 C R 32499241 + 8 ( 217069) [0]
> 8,0 0 0 4847.817110290 0 m N cfq7646 complete rqnoidle 0
> 8,0 0 0 4847.817112525 0 m N cfq7646 arm slice timer
> 8,0 0 0 4847.817113922 0 m N cfq7646 Idling on
> sync. st->count:1
> 8,0 0 0 4847.817115878 0 m N cfq7646 arm_idle: 8
> group_idle: 0
> 8,0 0 0 4847.817116995 0 m N cfq schedule dispatch
> 8,0 0 0 4847.817120068 0 m N cfq7646 select_queue:
> timer_pending - keep_queue
> 8,0 0 0 4847.824613293 0 m N cfq idle timer fired
> 8,0 0 0 4847.824616366 0 m N cfq7646
> idle_slice_timer: active queue
> 8,0 0 0 4847.824617763 0 m N cfq7646
> idle_slice_timer: clear_cfqq_deep
> 8,0 0 0 4847.824619439 0 m N cfq7646 slice expired t=0
> 8,0 0 0 4847.824621395 0 m N cfq7646 del timer
> 8,0 0 0 4847.824622792 0 m N cfq7646 update idle time stats
> 8,0 0 0 4847.824625865 0 m N cfq7646 sl_used=12
> disp=3 charge=12 iops=0 sect=24
> 8,0 0 0 4847.824627262 0 m N cfq7646 slice_expired
> del cfqq from rr
> 8,0 0 0 4847.824628379 0 m N cfq7646 del_from_rr
> 8,0 0 0 4847.824629776 0 m N cfq7646 slice_expired
> resort_rr_list
> 8,0 0 0 4847.824631452 0 m N cfq7646 slice_expired
> setting active_queue to NULL
> 8,0 0 0 4847.824632849 0 m N cfq7646 slice_expired
> setting active_cic to NULL
> 8,0 0 0 4847.824634246 0 m N cfq schedule dispatch
> 8,0 1 296749 4848.309940967 7646 A R 33545872 + 8 <- (8,4) 2152
>
> Dispatch:
> 8,0 0 0 4857.832309642 0 m N cfq5917 dispatched a request
> 8,0 0 0 4857.832310760 0 m N cfq5917 activate rq, drv=2
> 8,0 0 440336 4857.832312157 7284 D W 32800441 + 8
> (10035430955) [testd]
> 8,0 0 440337 4857.832447371 7284 C W 2076271 + 8 ( 687247) [0]
> 8,0 0 0 4857.832457987 0 m N cfq5917 complete rqnoidle 0
>
> --
> Josh
This doesn't seem right to me, but I could be misunderstanding
something. The stall I pasted in the previous mail shows an insert to
cfq5917:
> 8,0 1 296678 4847.796881202 6076 I W 32800441 + 8 (
> 2794) [flush-8:0]
> 8,0 1 0 4847.796883157 0 m N cfq5917 insert_request
I then grepped my previously attached log file for changes to rr for
this queue from the time of the insert to the dispatch:
8,0 1 0 4847.030323122 0 m N cfq5917 add_to_rr
8,0 1 0 4847.030333459 0 m N cfq5917 slice_expired
resort_rr_list
8,0 1 0 4847.797539953 0 m N cfq5917 slice_expired
del cfqq from rr
8,0 1 0 4847.797541070 0 m N cfq5917 del_from_rr
8,0 1 0 4847.797542467 0 m N cfq5917 slice_expired
resort_rr_list
8,0 1 0 4847.802484220 0 m N cfq5917 add_to_rr
8,0 0 0 4847.808710791 0 m N cfq5917 slice_expired
del cfqq from rr
8,0 0 0 4847.808711908 0 m N cfq5917 del_from_rr
8,0 0 0 4847.808713305 0 m N cfq5917 slice_expired
resort_rr_list
8,0 1 0 4847.809024801 0 m N cfq5917 add_to_rr
8,0 1 0 4847.810320233 0 m N cfq5917 slice_expired
del cfqq from rr
8,0 1 0 4847.810321351 0 m N cfq5917 del_from_rr
8,0 1 0 4847.810322748 0 m N cfq5917 slice_expired
resort_rr_list
8,0 1 0 4847.810447905 0 m N cfq5917 add_to_rr
8,0 1 0 4847.812490649 0 m N cfq5917 slice_expired
del cfqq from rr
8,0 1 0 4847.812492046 0 m N cfq5917 del_from_rr
8,0 1 0 4847.812493442 0 m N cfq5917 slice_expired
resort_rr_list
8,0 1 0 4852.819290238 0 m N cfq5917 slice_expired
resort_rr_list
8,0 0 0 4857.832506038 0 m N cfq5917 slice_expired
del cfqq from rr
8,0 0 0 4857.832507156 0 m N cfq5917 del_from_rr
8,0 0 0 4857.832508553 0 m N cfq5917 slice_expired
resort_rr_list
8,0 1 0 4857.832548782 0 m N cfq5917 add_to_rr
8,0 0 0 4857.832843796 0 m N cfq5917 slice_expired
del cfqq from rr
8,0 0 0 4857.832844913 0 m N cfq5917 del_from_rr
8,0 0 0 4857.832846310 0 m N cfq5917 slice_expired
resort_rr_list
8,0 1 0 4857.833060306 0 m N cfq5917 add_to_rr
Here's the dispatch for the request above:
> 8,0 0 0 4857.832310760 0 m N cfq5917 activate rq, drv=2
> 8,0 0 440336 4857.832312157 7284 D W 32800441 + 8
> (10035430955) [testd]
Shouldn't the queue stay on the RR list until it is empty?
--
Josh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists