lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 22 Apr 2013 09:44:23 +0400
From:	Azat Khuzhin <dohardgopro@...il.com>
To:	linux-ext4@...r.kernel.org
Subject: Re: ext4, jbd2: task blocked for more than 120 seconds.

On Mon, Apr 22, 2013 at 9:40 AM, Azat Khuzhin <dohardgopro@...il.com> wrote:
> Here is more that I tried to do:
>
> $ umount /dev/sdb5 # finished successfully

$ umount -l /dev/sdb5

> $ debugfs
> open_filesys /dev/sdb5 # finished successfully
> # Then I copy some files that I need at that moment to another partition
> # After I try to mount it
> $ mount /dev/sdb5 # didn't finished
> root:/sys# cat /proc/$(pgrep mount)/status | grep State
> State:  D (disk sleep)
>
> On Sat, Apr 20, 2013 at 8:05 PM, Azat Khuzhin <dohardgopro@...il.com> wrote:
>> Today I have "blocked for more than 120 seconds" in dmesg, while doing
>> next things:
>>
>> - Run ktorrent
>> - Download torrent, and don't have enough space to save
>> - mv of 84 GiB folder into another parititon (not waiting for the end)
>> - Continue downloading with ktorrent
>> - When there is only 4.0K available on partition (where ktorrent
>> saves), all jobs hangs
>>   - ktorrent
>>   - mv, that not finished
>>   - mplayer, that try to get next file (seems that current one was in
>> cache, about 130 MiB in size)
>>
>> And after this load avg was equal to 8.
>>
>> I've try to investigate further, see into some old patches, blame, but
>> I don't have any ideas.
>> So here is some more info about it:
>>
>> # Kernel version:
>>
>> $ uname -a
>> Linux home-spb 3.8.0custom.x86-64.1.8-10206-gb0af9cd-dirty #1 SMP Sun
>> Mar 3 20:39:58 MSK 2013 x86_64 GNU/Linux
>> $ grep sdb5 /proc/mounts
>> /dev/sdb5 /mnt/docs ext4 rw,relatime,data=ordered 0 0
>>
>> $ iostat 2 /dev/sdb
>> Linux 3.8.0custom.x86-64.1.8-10206-gb0af9cd-dirty (home-spb)
>> 04/20/2013      _x86_64_        (4 CPU)
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            3.31    0.00    0.59    0.56    0.00   95.54
>>
>> Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
>> sdb               0.76        34.68        32.07   29794122   27548981
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            2.39    0.00    0.38    0.00    0.00   97.23
>>
>> Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
>> sdb               0.00         0.00         0.00          0          0
>>
>> # dmesg snippets
>>
>> [836402.486062] INFO: task flush-8:16:24991 blocked for more than 120 seconds.
>> [836402.486738] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [836402.488080] flush-8:16      D ffff88022ea93a40     0 24991      2 0x00000000
>> [836402.489419]  ffff8801cb341c40 0000000000000046 ffff88020a20b9b8
>> ffff8802258fc6a0
>> [836402.490747]  0000000000013a40 ffff88020a20bfd8 ffff88020a20bfd8
>> ffff8801cb341c40
>> [836402.492091]  0000000000000092 ffff8801cb341c40 ffff8801fe8d2980
>> 0000000000000001
>> [836402.493457] Call Trace:
>> [836402.494133]  [<ffffffff813a9701>] ? rwsem_down_failed_common+0x1bd/0x1f6
>> [836402.494810]  [<ffffffff810c828b>] ? release_pages+0x159/0x168
>> [836402.495485]  [<ffffffff811fcea3>] ? call_rwsem_down_write_failed+0x13/0x20
>> [836402.496163]  [<ffffffff813a85c4>] ? down_write+0x25/0x27
>> [836402.496838]  [<ffffffff8115e264>] ? ext4_map_blocks+0x1d0/0x30e
>> [836402.497514]  [<ffffffff8116142d>] ? mpage_da_map_and_submit+0x8a/0x314
>> [836402.498185]  [<ffffffff81161cc7>] ? ext4_da_writepages+0x2ab/0x44e
>> [836402.498868]  [<ffffffff811e1f5e>] ? __blk_run_queue_uncond+0x1e/0x26
>> [836402.499544]  [<ffffffff811e1fd0>] ? queue_unplugged+0x59/0x7d
>> [836402.500297]  [<ffffffff8111d7a0>] ? __writeback_single_inode+0x4b/0x1d6
>> [836402.501019]  [<ffffffff8111df91>] ? writeback_sb_inodes+0x1dd/0x30c
>> [836402.501763]  [<ffffffff8111e129>] ? __writeback_inodes_wb+0x69/0xab
>> [836402.502503]  [<ffffffff8111e2a2>] ? wb_writeback+0x137/0x28c
>> [836402.503207]  [<ffffffff8111ecff>] ? wb_do_writeback+0x146/0x1bb
>> [836402.503922]  [<ffffffff81046fbe>] ? del_timer_sync+0x27/0x44
>> [836402.504636]  [<ffffffff8111ed74>] ? wb_do_writeback+0x1bb/0x1bb
>> [836402.505345]  [<ffffffff8111ee1c>] ? bdi_writeback_thread+0xa8/0x1db
>> [836402.506031]  [<ffffffff8111ed74>] ? wb_do_writeback+0x1bb/0x1bb
>> [836402.506739]  [<ffffffff8111ed74>] ? wb_do_writeback+0x1bb/0x1bb
>> [836402.507413]  [<ffffffff810551b0>] ? kthread+0x81/0x89
>> [836402.508089]  [<ffffffff8105512f>] ? __kthread_parkme+0x5c/0x5c
>> [836402.508795]  [<ffffffff813afafc>] ? ret_from_fork+0x7c/0xb0
>> [836402.509467]  [<ffffffff8105512f>] ? __kthread_parkme+0x5c/0x5c
>> [836402.510157] INFO: task mv:31543 blocked for more than 120 seconds.
>> [836402.510848] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [836402.512206] mv              D ffff88022ea13a40     0 31543  30551 0x00000000
>> [836402.513550]  ffff880224603170 0000000000000086 0000000000000000
>> ffffffff81613400
>> [836402.514900]  0000000000013a40 ffff880050b61fd8 ffff880050b61fd8
>> ffff880224603170
>> [836402.516237]  0000000000000246 ffff880225358800 0000000000000000
>> ffff880225358868
>> [836402.517598] Call Trace:
>> [836402.518266]  [<ffffffff811973f8>] ? start_this_handle+0x2d0/0x46b
>> [836402.518951]  [<ffffffff8105e78b>] ? should_resched+0x5/0x23
>> [836402.519630]  [<ffffffff810557d1>] ? bit_waitqueue+0x82/0x82
>> [836402.520305]  [<ffffffff8119772d>] ? jbd2__journal_start+0x94/0x154
>> [836402.521027]  [<ffffffff8116218f>] ? ext4_dirty_inode+0x1e/0x51
>> [836402.521743]  [<ffffffff8111d999>] ? __mark_inode_dirty+0x6e/0x1f1
>> [836402.522449]  [<ffffffff811105f4>] ? update_time+0xa0/0xa9
>> [836402.523121]  [<ffffffff81110cba>] ? touch_atime+0xe3/0x11c
>> [836402.523795]  [<ffffffff810bfe8a>] ? generic_file_aio_read+0x55a/0x5b9
>> [836402.524476]  [<ffffffff810fe062>] ? do_sync_read+0x62/0x9b
>> [836402.525156]  [<ffffffff810fe663>] ? vfs_read+0x93/0xf5
>> [836402.525842]  [<ffffffff810fe716>] ? sys_read+0x51/0x80
>> [836402.526527]  [<ffffffff813afba9>] ? system_call_fastpath+0x16/0x1b
>> [836522.558599] INFO: task jbd2/sdb5-8:1794 blocked for more than 120 seconds.
>> [836522.559400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [836522.560999] jbd2/sdb5-8     D ffff88022ea93a40     0  1794      2 0x00000000
>> [836522.562591]  ffff8802253b3170 0000000000000046 0000000131cd9ccb
>> ffff8802258fc6a0
>> [836522.564169]  0000000000013a40 ffff880225161fd8 ffff880225161fd8
>> ffff8802253b3170
>> [836522.565565]  0000000000000246 ffff880225358800 ffff880225161d80
>> ffff88008fd7ca70
>> [836522.566953] Call Trace:
>> [836522.567660]  [<ffffffff8119931d>] ?
>> jbd2_journal_commit_transaction+0x22f/0x135e
>> [836522.569014]  [<ffffffff8100d5ff>] ? __switch_to+0x18d/0x3f8
>> [836522.569780]  [<ffffffff810557d1>] ? bit_waitqueue+0x82/0x82
>> [836522.570519]  [<ffffffff81046e67>] ? lock_timer_base.isra.37+0x23/0x48
>> [836522.571225]  [<ffffffff81046f8c>] ? try_to_del_timer_sync+0x4e/0x59
>> [836522.571958]  [<ffffffff8119f0df>] ? kjournald2+0xb7/0x239
>> [836522.572688]  [<ffffffff810557d1>] ? bit_waitqueue+0x82/0x82
>> [836522.573386]  [<ffffffff8119f028>] ? journal_init_common+0x165/0x165
>> [836522.574093]  [<ffffffff8119f028>] ? journal_init_common+0x165/0x165
>> [836522.574772]  [<ffffffff810551b0>] ? kthread+0x81/0x89
>> [836522.575447]  [<ffffffff8105512f>] ? __kthread_parkme+0x5c/0x5c
>> [836522.576127]  [<ffffffff813afafc>] ? ret_from_fork+0x7c/0xb0
>> [836522.576804]  [<ffffffff8105512f>] ? __kthread_parkme+0x5c/0x5c
>> [836522.577526] INFO: task ktorrent:11063 blocked for more than 120 seconds.
>> [836522.578205] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [836522.579569] ktorrent        D ffff88022eb93a40     0 11063      1 0x00000000
>> [836522.580927]  ffff8801cb3454c0 0000000000000086 0000000000000292
>> ffff8802258fd4c0
>> [836522.582274]  0000000000013a40 ffff880050f73fd8 ffff880050f73fd8
>> ffff8801cb3454c0
>> [836522.583664]  0000000000000246 ffff880225358800 ffff880050f73ab0
>> ffff880225358824
>> [836522.585006] Call Trace:
>> [836522.585694]  [<ffffffff8119def0>] ? jbd2_log_wait_commit+0xc3/0x118
>> [836522.586371]  [<ffffffff810557d1>] ? bit_waitqueue+0x82/0x82
>> [836522.587048]  [<ffffffff8119ff52>] ?
>> jbd2_journal_force_commit_nested+0x63/0x91
>> [836522.588388]  [<ffffffff8115d729>] ? ext4_da_get_block_prep+0x329/0x485
>> [836522.589082]  [<ffffffff81122ef3>] ? alloc_buffer_head+0x18/0x5b
>> [836522.589775]  [<ffffffff81124d80>] ? __block_write_begin+0x125/0x27d
>> [836522.590467]  [<ffffffff81072724>] ? current_kernel_time+0x9/0x30
>> [836522.591144]  [<ffffffff8115d400>] ? ext4_num_dirty_pages.isra.44+0x17f/0x17f
>> [836522.592502]  [<ffffffff810ffe13>] ? __sb_start_write+0xb6/0xe8
>> [836522.593195]  [<ffffffff810de216>] ? vma_gap_callbacks_propagate+0x13/0x2c
>> [836522.593899]  [<ffffffff8115d400>] ? ext4_num_dirty_pages.isra.44+0x17f/0x17f
>> [836522.595252]  [<ffffffff81124f4d>] ? __block_page_mkwrite+0x75/0xb7
>> [836522.595932]  [<ffffffff81162382>] ? ext4_page_mkwrite+0x8e/0x2a4
>> [836522.596614]  [<ffffffff810d8e36>] ? __do_fault+0x15b/0x35d
>> [836522.597304]  [<ffffffff810db20d>] ? handle_pte_fault+0x28b/0x708
>> [836522.597981]  [<ffffffff81033da9>] ? pte_alloc_one+0x2f/0x39
>> [836522.598670]  [<ffffffff813aca20>] ? __do_page_fault+0x32d/0x375
>> [836522.599346]  [<ffffffff810d3498>] ? vm_mmap_pgoff+0x82/0xab
>> [836522.600023]  [<ffffffff813a9898>] ? _raw_spin_lock_irq+0xb/0x15
>> [836522.600700]  [<ffffffff8104b8ca>] ? do_sigaction+0xc2/0x167
>> [836522.601373]  [<ffffffff8105e82c>] ? finish_task_switch+0x83/0xb4
>> [836522.602066]  [<ffffffff8105e78b>] ? should_resched+0x5/0x23
>> [836522.602757]  [<ffffffff813a8dd8>] ? _cond_resched+0x6/0x1b
>> [836522.603432]  [<ffffffff813a9e58>] ? page_fault+0x28/0x30
>>
>> # Pointers to sources
>>
>> (gdb) list *(jbd2_log_wait_commit+0xc3)
>> 0xffffffff8119def0 is in jbd2_log_wait_commit (fs/jbd2/journal.c:699).
>> 694             while (tid_gt(tid, journal->j_commit_sequence)) {
>> 695                     jbd_debug(1, "JBD2: want %d, j_commit_sequence=%d\n",
>> 696                                       tid, journal->j_commit_sequence);
>> 697                     wake_up(&journal->j_wait_commit);
>> 698                     read_unlock(&journal->j_state_lock);
>> 699                     wait_event(journal->j_wait_done_commit,
>> 700                                     !tid_gt(tid,
>> journal->j_commit_sequence));
>> 701                     read_lock(&journal->j_state_lock);
>> 702             }
>> 703             read_unlock(&journal->j_state_lock);
>> (gdb) list *(jbd2_journal_commit_transaction+0x22f)
>> 0xffffffff8119931d is in jbd2_journal_commit_transaction (fs/jbd2/commit.c:457).
>> 452                                             TASK_UNINTERRUPTIBLE);
>> 453                     if (atomic_read(&commit_transaction->t_updates)) {
>> 454                             spin_unlock(&commit_transaction->t_handle_lock);
>> 455                             write_unlock(&journal->j_state_lock);
>> 456                             schedule();
>> 457                             write_lock(&journal->j_state_lock);
>> 458                             spin_lock(&commit_transaction->t_handle_lock);
>> 459                     }
>> 460                     finish_wait(&journal->j_wait_updates, &wait);
>> 461             }
>> (gdb) list *(start_this_handle+0x2d0)
>> 0xffffffff811973f8 is in start_this_handle (fs/jbd2/transaction.c:237).
>> warning: Source file is more recent than executable.
>> 232
>> 233                     prepare_to_wait(&journal->j_wait_transaction_locked,
>> 234                                             &wait, TASK_UNINTERRUPTIBLE);
>> 235                     read_unlock(&journal->j_state_lock);
>> 236                     schedule();
>> 237                     finish_wait(&journal->j_wait_transaction_locked, &wait);
>> 238                     goto repeat;
>> 239             }
>> 240
>> 241             /*
>> (gdb) l *(ext4_map_blocks+0x1d0)
>> 0xffffffff8115e264 is in ext4_map_blocks (fs/ext4/inode.c:609).
>> 604              * if the caller is from delayed allocation writeout path
>> 605              * we have already reserved fs blocks for allocation
>> 606              * let the underlying get_block() function know to
>> 607              * avoid double accounting
>> 608              */
>> 609             if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
>> 610                     ext4_set_inode_state(inode,
>> EXT4_STATE_DELALLOC_RESERVED);
>> 611             /*
>> 612              * We need to check for EXT4 here because migrate
>> 613              * could have changed the inode type in between
>> (gdb) l *(ext4_map_blocks+(0x1d0-8)) # I actually think that instead
>> of previous command you want to see at this one.
>> 0xffffffff8115e25c is in ext4_map_blocks (fs/ext4/inode.c:601).
>> 596              * New blocks allocate and/or writing to uninitialized extent
>> 597              * will possibly result in updating i_data, so we take
>> 598              * the write lock of i_data_sem, and call get_blocks()
>> 599              * with create == 1 flag.
>> 600              */
>> 601             down_write((&EXT4_I(inode)->i_data_sem));
>> 602
>> 603             /*
>> 604              * if the caller is from delayed allocation writeout path
>> 605              * we have already reserved fs blocks for allocation
>>
>> --
>> Respectfully
>> Azat Khuzhin
>> Primary email a3at.mail@...il.com
>
>
>
> --
> Respectfully
> Azat Khuzhin
> Primary email a3at.mail@...il.com



--
Respectfully
Azat Khuzhin
Primary email a3at.mail@...il.com
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ