lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKA=qzYoX-1nTZ6QfdD7RhWf=KU7J6tWOJXgt-F+Kz2c3Xn0TQ@mail.gmail.com>
Date:	Tue, 26 Jun 2012 15:37:31 -0500
From:	Josh Hunt <joshhunt00@...il.com>
To:	Vivek Goyal <vgoyal@...hat.com>, Tejun Heo <tj@...nel.org>
Cc:	Jens Axboe <axboe@...nel.dk>, linux-kernel@...r.kernel.org
Subject: Re: multi-second application stall in open()

On Tue, Jun 26, 2012 at 10:53 AM, Vivek Goyal <vgoyal@...hat.com> wrote:
> On Tue, Jun 26, 2012 at 10:18:04AM -0500, Josh Hunt wrote:
>> On Tue, Jun 26, 2012 at 7:59 AM, Vivek Goyal <vgoyal@...hat.com> wrote:
>> > On Mon, Jun 25, 2012 at 11:01:48PM -0500, Josh Hunt wrote:
>> >
>> > [..]
>> >> So this really seems like a problem with kblockd not kicking in. I've
>> >> instrumented every path in select_queue and it's not getting hit after
>> >> schedule dispatch. Everything seems to stall at that point until a new
>> >> request comes in.
>> >
>> > Ok, that's cool. So now we need to find out why queued work is not being
>> > scheduled.
>> >
>> > I think there are some workqueue related trace points. If you enable those
>> > along with blktraces, that should give tejun some data to look at.
>> >
>> > Thanks
>> > Vivek
>>
>> Tejun
>>
>> Do you have any suggestions on how to debug this?
>>
>> I did "perf record -a -e workqueue:*" and grabbed some tracepoint
>> data, but it's hard to correlate when these events are occurring in
>> the blktrace logs. Will keep investigating.
>
> If you capture blktrace logs through trace_pipe and not blktrace
> tool, you will get both workqueue and block traces with time stamps and
> then correlating these becomes easier. So just enable "blk" tracer, enable
> tracing on that particular device and then enable certaion workqueue
> related trace events and capture trace_pipe output.
>
> thanks
> Vivek

Thanks Vivek! Always nice to learn some new debugging tricks! :)

I don't know if these will paste in very well so I'm attaching some
snippets from the log I captured as well. Here is the bad case:

           testd-7261  [000]  9527.900483:   8,0    m   N cfq14995S
complete rqnoidle 0
           testd-7261  [000]  9527.900485:   8,0    m   N cfq14995S
arm slice timer
           testd-7261  [000]  9527.900486:   8,0    m   N cfq14995S
Idling. st->count:1, hw_tag:0
           testd-7261  [000]  9527.900488:   8,0    m   N cfq14995S
arm_idle: 8 group_idle: 0
           testd-7261  [000]  9527.900489:   8,0    m   N cfq schedule
dispatch: busy_queues:1 rq_queued:0 rq_in_driver:0 rq_in_flight[0]:0
[1]:0
           testd-7261  [000]  9527.900494:   8,0    m   N cfq
select_queue: !rq_queued
           testd-7261  [000]  9527.907998:   8,0    m   N cfq idle timer fired
           testd-7261  [000]  9527.908003:   8,0    m   N cfq14995S
idle_slice_timer: active queue
           testd-7261  [000]  9527.908004:   8,0    m   N cfq14995S
idle_slice_timer: clear_cfqq_deep
           testd-7261  [000]  9527.908006:   8,0    m   N cfq14995S
slice expired t=0
           testd-7261  [000]  9527.908007:   8,0    m   N cfq14995S del timer
           testd-7261  [000]  9527.908008:   8,0    m   N cfq14995S
update idle time stats
           testd-7261  [000]  9527.908011:   8,0    m   N cfq14995S
sl_used=9 disp=2 charge=9 iops=0 sect=16
           testd-7261  [000]  9527.908012:   8,0    m   N cfq14995S
slice_expired del cfqq from rr - on_rr:1 rb empty:1
           testd-7261  [000]  9527.908014:   8,0    m   N cfq14995S
del_from_rr queued[0]:0 queued[1]:0
           testd-7261  [000]  9527.908015:   8,0    m   N cfq14995S
slice_expired resort_rr_list
           testd-7261  [000]  9527.908016:   8,0    m   N cfq14995S
slice_expired setting active_queue to NULL
           testd-7261  [000]  9527.908017:   8,0    m   N cfq14995S
slice_expired setting active_cic to NULL
           testd-7261  [000]  9527.908019:   8,0    m   N cfq schedule
dispatch: busy_queues:1 rq_queued:1 rq_in_driver:0 rq_in_flight[0]:0
[1]:0
       flush-8:0-6082  [001]  9528.075769:   8,0    A   W 28304865 + 8
<- (8,3) 7340040
       flush-8:0-6082  [001]  9528.075773: block_bio_remap: 8,0 W
28304865 + 8 <- (8,3) 7340040

The print "cfq schedule dispatch: busy_queues:1 rq_queued:1
rq_in_driver:0 rq_in_flight[0]:0 [1]:0" is inside the if to call
kblockd_schedule_work() so it is getting called, but you can see that
we never see the workqueue_queue_work tracepoint being hit.

Here's what it looks like in the good case:

          <idle>-0     [000]  9523.042437:   8,0    m   N cfq idle timer fired
          <idle>-0     [000]  9523.042441:   8,0    m   N cfq6082S
idle_slice_timer: active queue
          <idle>-0     [000]  9523.042442:   8,0    m   N cfq6082S
idle_slice_timer: clear_cfqq_deep
          <idle>-0     [000]  9523.042443:   8,0    m   N cfq6082S
slice expired t=0
          <idle>-0     [000]  9523.042444:   8,0    m   N cfq6082S del timer
          <idle>-0     [000]  9523.042445:   8,0    m   N cfq6082S
update idle time stats
          <idle>-0     [000]  9523.042447:   8,0    m   N cfq6082S
sl_used=8 disp=1 charge=8 iops=0 sect=8
          <idle>-0     [000]  9523.042449:   8,0    m   N cfq6082S
slice_expired del cfqq from rr - on_rr:1 rb empty:1
          <idle>-0     [000]  9523.042450:   8,0    m   N cfq6082S
del_from_rr queued[0]:0 queued[1]:0
          <idle>-0     [000]  9523.042451:   8,0    m   N cfq6082S
slice_expired resort_rr_list
          <idle>-0     [000]  9523.042452:   8,0    m   N cfq6082S
slice_expired setting active_queue to NULL
          <idle>-0     [000]  9523.042453:   8,0    m   N cfq6082S
slice_expired setting active_cic to NULL
          <idle>-0     [000]  9523.042455:   8,0    m   N cfq schedule
dispatch: busy_queues:1 rq_queued:1 rq_in_driver:0 rq_in_flight[0]:0
[1]:0
          <idle>-0     [000]  9523.042456: workqueue_queue_work: work
struct=ffff880108db8a48 function=cfq_kick_queue
workqueue=ffff88010cf5e800 req_cpu=0 cpu=0
          <idle>-0     [000]  9523.042456: workqueue_activate_work:
work struct ffff880108db8a48
     kworker/0:1-12940 [000]  9523.042462: workqueue_execute_start:
work struct ffff880108db8a48: function cfq_kick_queue
     kworker/0:1-12940 [000]  9523.042462:   8,0    m   N cfq kick queue
     kworker/0:1-12940 [000]  9523.042464:   8,0    m   N cfq
select_queue: !cfqq
     kworker/0:1-12940 [000]  9523.042464:   8,0    m   N cfq
select_queue: cfq_choose_cfqg
     kworker/0:1-12940 [000]  9523.042466:   8,0    m   N cfq workload slice:40
     kworker/0:1-12940 [000]  9523.042466:   8,0    m   N cfq
select_queue: setting active queue to new_cfqq:NULL
     kworker/0:1-12940 [000]  9523.042468:   8,0    m   N cfq5979A
set_active wl_prio:0 wl_type:0

Tejun - if you could let me know what other data I should be looking
for/need to enable in the traces I would appreciate it.

Thanks!
-- 
Josh

Download attachment "workqueue.trace" of type "application/octet-stream" (255407 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ