lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20061114095818.GA2541@zip.com.au>
Date:	Tue, 14 Nov 2006 20:58:18 +1100
From:	CaT <cat@....com.au>
To:	"Dennis J.A. Bijwaard" <dennis@...22032063.dsl.speedlinq.nl>
Cc:	Andrew Morton <akpm@...l.org>, bijwaard@...il.com, sct@...hat.com,
	adilger@...sterfs.com, linux-kernel@...r.kernel.org
Subject: Re: BUG: soft lockup detected on CPU#0! in sys_close and ext3

On Sun, Oct 15, 2006 at 11:58:55PM +0200, Dennis J.A. Bijwaard wrote:
> Hi Andrew,
> 
> Thanks for your reply. The machine has 512MB and some more swap:
> 
> Mem:    510960k total,   504876k used,     6084k free,     1868k buffers
> Swap:   674640k total,     2652k used,   671988k free,   354832k cached
> 
> Machine may be slow for current standards, it has 2 * 500Mhz

This looks like what I have come across yesterda, except my machine is
far from slow. It's a dual core woodcrest 5130 cpu (2Ghz) with 4gb of
ram. My call trace looks very similar and occurs when sda1 (for eg) is 
involved or md0 (raid0 in this case) is. It happens multiple times in 
each case (from memory).

Nov 13 09:46:04 localhost kernel: Call Trace:
Nov 13 09:46:04 localhost kernel:  [show_trace+185/628] show_trace+0xb9/0x274
Nov 13 09:46:04 localhost kernel:  [dump_stack+21/26] dump_stack+0x15/0x1a
Nov 13 09:46:04 localhost kernel:  [softlockup_tick+231/264] softlockup_tick+0xe7/0x108
Nov 13 09:46:04 localhost kernel:  [run_local_timers+19/21] run_local_timers+0x13/0x15
Nov 13 09:46:04 localhost kernel:  [update_process_times+83/137] update_process_times+0x53/0x89
Nov 13 09:46:04 localhost kernel:  [smp_local_timer_interrupt+46/89] smp_local_timer_interrupt+0x2e/0x59
Nov 13 09:46:04 localhost kernel:  [smp_apic_timer_interrupt+92/106] smp_apic_timer_interrupt+0x5c/0x6a
Nov 13 09:46:04 localhost kernel:  [apic_timer_interrupt+107/112] apic_timer_interrupt+0x6b/0x70
Nov 13 09:46:04 localhost kernel:  [_write_unlock_irqrestore+81/95] _write_unlock_irqrestore+0x51/0x5f
Nov 13 09:46:04 localhost kernel:  [test_clear_page_dirty+189/224] test_clear_page_dirty+0xbd/0xe0
Nov 13 09:46:04 localhost kernel:  [try_to_free_buffers+123/174] try_to_free_buffers+0x7b/0xae
Nov 13 09:46:04 localhost kernel:  [try_to_release_page+68/74] try_to_release_page+0x44/0x4a
Nov 13 09:46:04 localhost kernel:  [invalidate_complete_page+54/191] invalidate_complete_page+0x36/0xbf
Nov 13 09:46:04 localhost kernel:  [invalidate_mapping_pages+157/280] invalidate_mapping_pages+0x9d/0x118
Nov 13 09:46:04 localhost kernel:  [invalidate_inode_pages+18/20] invalidate_inode_pages+0x12/0x14
Nov 13 09:46:04 localhost kernel:  [invalidate_bdev+47/56] invalidate_bdev+0x2f/0x38
Nov 13 09:46:04 localhost kernel:  [kill_bdev+25/52] kill_bdev+0x19/0x34
Nov 13 09:46:04 localhost kernel:  [__blkdev_put+88/314] __blkdev_put+0x58/0x13a
Nov 13 09:46:04 localhost kernel:  [blkdev_put+11/13] blkdev_put+0xb/0xd
Nov 13 09:46:04 localhost kernel:  [blkdev_close+52/61] blkdev_close+0x34/0x3d
Nov 13 09:46:04 localhost kernel:  [__fput+170/306] __fput+0xaa/0x132
Nov 13 09:46:04 localhost kernel:  [fput+21/23] fput+0x15/0x17
Nov 13 09:46:04 localhost kernel:  [filp_close+109/129] filp_close+0x6d/0x81
Nov 13 09:46:04 localhost kernel:  [sys_close+140/168] sys_close+0x8c/0xa8
Nov 13 09:46:04 localhost kernel:  [system_call+126/131] system_call+0x7e/0x83
Nov 13 09:46:04 localhost kernel:  [phys_startup_64+-1072859838/2147483392]

> > So the kernel was doing a lot of work, on a slow CPU.  Perhaps that simply
> > exceeded the softlockup timeout.  If that's true then the machine should
> > have recovered.  Once it did, and once it didn't.  I don't know why it
> > didn't.

It did for me in all cases. Should it be taking long enough to trigger
the softlock timeout to do this? The size of the device is approx 280gig.

-- 
    "To the extent that we overreact, we proffer the terrorists the
    greatest tribute."
    	- High Court Judge Michael Kirby
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ