linux-kernel - Re: 2.6.21-rc5-mm1

Message-Id: <200703281844.58698.m.kozlowski@tuxland.pl>
Date:	Wed, 28 Mar 2007 18:44:57 +0200
From:	Mariusz Kozłowski <m.kozlowski@...land.pl>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: 2.6.21-rc5-mm1

Hello,

	I ran 2.6.21-rc4-mm1 for a week with no hangs.
When 2.6.21-rc5-mm1 showed up I switched to it. Unfortunately,
today my laptop hung twice in a way similar to the one described here:

http://www.ussg.iu.edu/hypermail/linux/kernel/0703.0/index.html#1165

The difference is that it happened when I closed the lid of my laptop.
When I reopened it the box was frozen (ACPI?). Again disk I/O was dead,
so nothing made it to syslog.

I tried to reproduce it and capture something with netconsole.
I tortured the box for a few hours but the system did not hang. I pushed
the box real hard and what I got was only oom-killer firing etc ;-)
Anyway I found something else you might be interested in:



1) This happened when I ran 'echo 3 > /proc/sys/vm/drop_caches' on a
   really busy system.

 =======================================================
 [ INFO: possible circular locking dependency detected ]
 2.6.21-rc5-mm1 #1
 -------------------------------------------------------
 bash/20633 is trying to acquire lock:
  (&journal->j_list_lock){--..}, at: [<c01bb60e>] journal_try_to_free_buffers+0x151/0x1bc
 
 but task is already holding lock:
  (inode_lock){--..}, at: [<c0187d46>] drop_pagecache+0x58/0xf9
 
 which lock already depends on the new lock.
 
 
 the existing dependency chain (in reverse order) is:
 
 -> #1 (inode_lock){--..}:
        [<c0132a0a>] __lock_acquire+0xe31/0xfe9
        [<c0132c2b>] lock_acquire+0x69/0x83
        [<c0406fb5>] _spin_lock+0x35/0x42
        [<c0187734>] __mark_inode_dirty+0x4f/0x163
        [<c01533c2>] __set_page_dirty_nobuffers+0x99/0xf5
        [<c018b0db>] mark_buffer_dirty+0x1f/0x25
        [<c01b8e3d>] __journal_temp_unlink_buffer+0x88/0x1a8
        [<c01b9192>] __journal_unfile_buffer+0xb/0x15
        [<c01b9289>] __journal_refile_buffer+0xed/0xef
        [<c01bc57c>] journal_commit_transaction+0xd8b/0x127f
        [<c01c033b>] kjournald+0xac/0x1ed
        [<c0127a05>] kthread+0xa2/0xc9
        [<c01042af>] kernel_thread_helper+0x7/0x18
        [<ffffffff>] 0xffffffff
 
 -> #0 (&journal->j_list_lock){--..}:
        [<c0132873>] __lock_acquire+0xc9a/0xfe9
        [<c0132c2b>] lock_acquire+0x69/0x83
        [<c0406fb5>] _spin_lock+0x35/0x42
        [<c01bb60e>] journal_try_to_free_buffers+0x151/0x1bc
        [<c01ad9f3>] ext3_releasepage+0x3f/0x76
        [<c014f6e4>] try_to_release_page+0x2f/0x4a
        [<c0156588>] invalidate_mapping_pages+0xb6/0xee
        [<c0187d94>] drop_pagecache+0xa6/0xf9
        [<c0187e3b>] drop_caches_sysctl_handler+0x54/0x69
        [<c01a2ccb>] proc_sys_write+0x80/0x8a
        [<c016ce3d>] vfs_write+0x8b/0x11f
        [<c016d371>] sys_write+0x3d/0x64
        [<c0103f44>] sysenter_past_esp+0x5d/0x99
        [<ffffffff>] 0xffffffff
 
 other info that might help us debug this:
 
 2 locks held by bash/20633:
  #0:  (&type->s_umount_key#16){----}, at: [<c0187d35>] drop_pagecache+0x47/0xf9
  #1:  (inode_lock){--..}, at: [<c0187d46>] drop_pagecache+0x58/0xf9
 
 stack backtrace:
  [<c0104614>] show_trace_log_lvl+0x1a/0x30
  [<c01052c9>] show_trace+0x12/0x14
  [<c0105355>] dump_stack+0x16/0x18
  [<c01309f0>] print_circular_bug_tail+0x68/0x71
  [<c0132873>] __lock_acquire+0xc9a/0xfe9
  [<c0132c2b>] lock_acquire+0x69/0x83
  [<c0406fb5>] _spin_lock+0x35/0x42
  [<c01bb60e>] journal_try_to_free_buffers+0x151/0x1bc
  [<c01ad9f3>] ext3_releasepage+0x3f/0x76
  [<c014f6e4>] try_to_release_page+0x2f/0x4a
  [<c0156588>] invalidate_mapping_pages+0xb6/0xee
  [<c0187d94>] drop_pagecache+0xa6/0xf9
  [<c0187e3b>] drop_caches_sysctl_handler+0x54/0x69
  [<c01a2ccb>] proc_sys_write+0x80/0x8a
  [<c016ce3d>] vfs_write+0x8b/0x11f
  [<c016d371>] sys_write+0x3d/0x64
  [<c0103f44>] sysenter_past_esp+0x5d/0x99
 =======================



2) This was found a couple of minutes later when the system was
   really busy and close to an OOM condition.

 INFO: lockdep is turned off.
 BUG: soft lockup detected on CPU#0!
  [<c0104614>] show_trace_log_lvl+0x1a/0x30
  [<c01052c9>] show_trace+0x12/0x14
  [<c0105355>] dump_stack+0x16/0x18
  [<c01467a0>] softlockup_tick+0x81/0xa8
  [<c011e4dc>] run_local_timers+0x12/0x14
  [<c011e8dd>] update_process_times+0x2b/0x63
  [<c012e4be>] tick_sched_timer+0x4d/0x9e
  [<c012af00>] hrtimer_interrupt+0x12e/0x1a6
  [<c0106f56>] timer_interrupt+0xe/0x15
  [<c0146af3>] handle_IRQ_event+0x28/0x59
  [<c01480a7>] handle_level_irq+0x6e/0xe7
  [<c0105d3e>] do_IRQ+0x3d/0x7f
  [<c01041b2>] common_interrupt+0x2e/0x34
  [<c011afef>] do_softirq+0x4d/0x50
  [<c011b263>] irq_exit+0x7e/0x80
  [<c0105d43>] do_IRQ+0x42/0x7f
  [<c01041b2>] common_interrupt+0x2e/0x34
  [<c0178bf2>] core_sys_select+0x1c6/0x310
  [<c0179101>] sys_select+0x39/0x18f
  [<c0103f44>] sysenter_past_esp+0x5d/0x99
  =======================
 Clocksource tsc unstable (delta = 9372804176 ns)
 Time: acpi_pm clocksource has been installed.

Please find .config attached. Not sure who to CC on this (as usual ;-)).

Regards,

	Mariusz Kozlowski

View attachment ".config" of type "text/plain" (42516 bytes)