[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200808182251.31113.rjw@sisk.pl>
Date: Mon, 18 Aug 2008 22:51:30 +0200
From: "Rafael J. Wysocki" <rjw@...k.pl>
To: "Vegard Nossum" <vegard.nossum@...il.com>
Cc: "Pavel Machek" <pavel@...e.cz>,
"Robert Richter" <robert.richter@....com>,
"Ingo Molnar" <mingo@...e.hu>, "Andi Kleen" <ak@...ux.intel.com>,
"Philippe Elie" <phil.el@...adoo.fr>,
"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>
Subject: Re: oprofile + hibernation = badness
On Monday, 18 of August 2008, Vegard Nossum wrote:
> Hi,
>
> I'm probably crazy to do it, but...
>
> $ opcontrol --start
> $ echo disk > /sys/power/state
>
> ...leads to lots of badness/strangeness. First, lots of APIC errors:
>
> APIC error on CPU0: 40(40)
> [repeat 8 times]
> APIC error on CPU1: 40(40)
> APIC error on CPU1: 40(40)
> APIC error on CPU0: 40(40)
>
> These keep on coming all through the suspend/shutdown sequence, also
> intermixing with other messages. I'm guessing oprofile is trying to
> NMI CPUs that have been brought down?
They CPUs are taken down after suspending devices. If you boot the
kernel with 'no_console_suspend' in the command line, there will be more
debug information to have a look at.
Also, you can do
# echo core > /sys/power/pm_test
before the '$ echo disk > /sys/power/state' (that only tests the suspend
sequence without actually hibernating) and see what symptoms will be
reproduced.
> Now I get some ACPI Exceptions, but I think that these are unrelated
> to starting oprofile, because I have seen them on regular shutdowns as
> well:
>
> PM: Syncing filesystems ... done.
> Freezing user space processes ... (elapsed 0.00 seconds) done.
> Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
> PM: Shrinking memory... done (0 pages freed)
> PM: Freed 0 kbytes in 0.44 seconds (0.00 MB/s)
> Suspending console(s) (use no_console_suspend to debug)
> ACPI Exception (exoparg2-0444): AE_AML_PACKAGE_LIMIT, Index
> (000000007) is beyond end of object [20080609]
> ACPI Error (psparse-0530): Method parse/execution failed
> [\_SB_.PCI0.IDE0.GTM_] (Node f783bfc0), AE_AML_PACKAGE_LIMIT
> ACPI Error (psparse-0530): Method parse/execution failed
> [\_SB_.PCI0.IDE0.CHN0._GTM] (Node f783bb40), AE_AML_PACKAGE_LIMIT
> ACPI handle has no context!
> serial 00:0d: disabled
> serial 00:06: disabled
> ehci_hcd 0000:00:1d.7: PCI INT A disabled
> uhci_hcd 0000:00:1d.3: PCI INT D disabled
> uhci_hcd 0000:00:1d.2: PCI INT C disabled
> uhci_hcd 0000:00:1d.1: PCI INT B disabled
> uhci_hcd 0000:00:1d.0: PCI INT A disabled
> ACPI: Preparing to enter system sleep state S4
>
> After that, I see the message "WQ on CPU0, prefer CPU1" a few times.
> This must also be bad.
>
> Now some warnings:
>
> ------------[ cut here ]------------
> WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/smp.c:328 s
> mp_call_function_mask+0x194/0x1a0()
> Pid: 3711, comm: bash Not tainted 2.6.27-rc3-00415-g122c9e0 #10
> [<c013584f>] warn_on_slowpath+0x4f/0x80
> [<c0585ff8>] ? log_sample+0xa8/0xc0
> [<c03e4e8d>] ? device_power_down+0x5d/0x110
> [<c0586140>] ? oprofile_add_ext_sample+0x40/0xc0
> [<c05861f4>] ? oprofile_add_sample+0x34/0x50
> [<c0588f5c>] ? p4_check_ctr+0x13c/0x160
> [<c0160c54>] smp_call_function_mask+0x194/0x1a0
> [<c0587940>] ? nmi_cpu_stop+0x0/0x30
> [<c014e04c>] ? __atomic_notifier_call_chain+0x3c/0x50
> [<c014e07a>] ? atomic_notifier_call_chain+0x1a/0x20
> [<c014e0ad>] ? notify_die+0x2d/0x30
> [<c0587940>] ? nmi_cpu_stop+0x0/0x30
> [<c0160c90>] smp_call_function+0x30/0x60
> [<c0587940>] ? nmi_cpu_stop+0x0/0x30
> [<c013a16c>] on_each_cpu+0x2c/0x80
> [<c05878d4>] nmi_stop+0x14/0x20
> [<c05878f5>] nmi_suspend+0x15/0x20
> [<c03dff63>] sysdev_suspend+0xa3/0x1f0
> [<c03e4f0c>] device_power_down+0xdc/0x110
> [<c016355b>] hibernation_snapshot+0x15b/0x220
> [<c013644b>] ? printk+0x1b/0x20
> [<c0162742>] ? freeze_processes+0x52/0xa0
> [<c0163700>] hibernate+0xe0/0x180
> [<c01621b0>] ? state_store+0x0/0xd0
> [<c016226f>] state_store+0xbf/0xd0
> [<c01621b0>] ? state_store+0x0/0xd0
> [<c03618d4>] kobj_attr_store+0x24/0x30
> [<c01e66c2>] sysfs_write_file+0xa2/0x100
> [<c01a85b6>] vfs_write+0x96/0x130
> [<c01e6620>] ? sysfs_write_file+0x0/0x100
> [<c01a8afd>] sys_write+0x3d/0x70
> [<c01040db>] sysenter_do_call+0x12/0x3f
> [<c0670000>] ? _read_unlock_irqrestore+0x60/0x70
> =======================
> ---[ end trace df36f449a70f1c8e ]---
> PM: Creating hibernation image:
> PM: Need to copy 54048 pages
> PM: Hibernation image created (54048 pages copied)
> x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> Intel machine check architecture supported.
> BUG: using smp_processor_id() in preemptible [00000000] code: bash/3711
> caller is intel_p4_mcheck_init+0xb9/0x2b0
> Pid: 3711, comm: bash Tainted: G W 2.6.27-rc3-00415-g122c9e0 #10
> [<c037664e>] debug_smp_processor_id+0xce/0xd0
> [<c010e919>] intel_p4_mcheck_init+0xb9/0x2b0
> [<c010dcd9>] mcheck_init+0x59/0x80
> [<c0589636>] restore_processor_state+0x1f6/0x230
> [<c01635d1>] hibernation_snapshot+0x1d1/0x220
> [<c013644b>] ? printk+0x1b/0x20
> [<c0162742>] ? freeze_processes+0x52/0xa0
> [<c0163700>] hibernate+0xe0/0x180
> [<c01621b0>] ? state_store+0x0/0xd0
> [<c016226f>] state_store+0xbf/0xd0
> [<c01621b0>] ? state_store+0x0/0xd0
> [<c03618d4>] kobj_attr_store+0x24/0x30
> [<c01e66c2>] sysfs_write_file+0xa2/0x100
> [<c01a85b6>] vfs_write+0x96/0x130
> [<c01e6620>] ? sysfs_write_file+0x0/0x100
> [<c01a8afd>] sys_write+0x3d/0x70
> [<c01040db>] sysenter_do_call+0x12/0x3f
> [<c0670000>] ? _read_unlock_irqrestore+0x60/0x70
> =======================
> Intel machine check reporting enabled on CPU#0.
> BUG: using smp_processor_id() in preemptible [00000000] code: bash/3711
> caller is intel_p4_mcheck_init+0x100/0x2b0
> Pid: 3711, comm: bash Tainted: G W 2.6.27-rc3-00415-g122c9e0 #10
> [<c037664e>] debug_smp_processor_id+0xce/0xd0
> [<c010e960>] intel_p4_mcheck_init+0x100/0x2b0
> [<c010dcd9>] mcheck_init+0x59/0x80
> [<c0589636>] restore_processor_state+0x1f6/0x230
> [<c01635d1>] hibernation_snapshot+0x1d1/0x220
> [<c013644b>] ? printk+0x1b/0x20
> [<c0162742>] ? freeze_processes+0x52/0xa0
> [<c0163700>] hibernate+0xe0/0x180
> [<c01621b0>] ? state_store+0x0/0xd0
> [<c016226f>] state_store+0xbf/0xd0
> [<c01621b0>] ? state_store+0x0/0xd0
> [<c03618d4>] kobj_attr_store+0x24/0x30
> [<c01e66c2>] sysfs_write_file+0xa2/0x100
> [<c01a85b6>] vfs_write+0x96/0x130
> [<c01e6620>] ? sysfs_write_file+0x0/0x100
> [<c01a8afd>] sys_write+0x3d/0x70
> [<c01040db>] sysenter_do_call+0x12/0x3f
> [<c0670000>] ? _read_unlock_irqrestore+0x60/0x70
> =======================
> CPU0: Intel P4/Xeon Extended MCE MSRs (24) available
> BUG: using smp_processor_id() in preemptible [00000000] code: bash/3711
> caller is intel_p4_mcheck_init+0x119/0x2b0
> Pid: 3711, comm: bash Tainted: G W 2.6.27-rc3-00415-g122c9e0 #10
> [<c037664e>] debug_smp_processor_id+0xce/0xd0
> [<c010e979>] intel_p4_mcheck_init+0x119/0x2b0
> [<c010dcd9>] mcheck_init+0x59/0x80
> [<c0589636>] restore_processor_state+0x1f6/0x230
> [<c01635d1>] hibernation_snapshot+0x1d1/0x220
> [<c013644b>] ? printk+0x1b/0x20
> [<c0162742>] ? freeze_processes+0x52/0xa0
> [<c0163700>] hibernate+0xe0/0x180
> [<c01621b0>] ? state_store+0x0/0xd0
> [<c016226f>] state_store+0xbf/0xd0
> [<c01621b0>] ? state_store+0x0/0xd0
> [<c03618d4>] kobj_attr_store+0x24/0x30
> [<c01e66c2>] sysfs_write_file+0xa2/0x100
> [<c01a85b6>] vfs_write+0x96/0x130
> [<c01e6620>] ? sysfs_write_file+0x0/0x100
> [<c01a8afd>] sys_write+0x3d/0x70
> [<c01040db>] sysenter_do_call+0x12/0x3f
> [<c0670000>] ? _read_unlock_irqrestore+0x60/0x70
> =======================
> CPU0: Thermal monitoring enabled
> Force enabled HPET at resume
> BUG: using smp_processor_id() in preemptible [00000000] code: bash/3711
> caller is retrigger_next_event+0x15/0xb0
> Pid: 3711, comm: bash Tainted: G W 2.6.27-rc3-00415-g122c9e0 #10
> [<c037664e>] debug_smp_processor_id+0xce/0xd0
> [<c014c6c5>] retrigger_next_event+0x15/0xb0
> [<c066d3bf>] ? preempt_schedule+0x3f/0x50
> [<c066fe1c>] ? _spin_unlock+0x3c/0x50
> [<c014c76a>] hres_timers_resume+0xa/0x10
> [<c01509bc>] timekeeping_resume+0x12c/0x180
> [<c03df894>] __sysdev_resume+0x14/0x50
> [<c03dfdae>] sysdev_resume+0x3e/0x80
> [<c03e4f4b>] device_power_up+0xb/0x20
> [<c01635b0>] hibernation_snapshot+0x1b0/0x220
> [<c013644b>] ? printk+0x1b/0x20
> [<c0162742>] ? freeze_processes+0x52/0xa0
> [<c0163700>] hibernate+0xe0/0x180
> [<c01621b0>] ? state_store+0x0/0xd0
> [<c016226f>] state_store+0xbf/0xd0
> [<c01621b0>] ? state_store+0x0/0xd0
> [<c03618d4>] kobj_attr_store+0x24/0x30
> [<c01e66c2>] sysfs_write_file+0xa2/0x100
> [<c01a85b6>] vfs_write+0x96/0x130
> [<c01e6620>] ? sysfs_write_file+0x0/0x100
> [<c01a8afd>] sys_write+0x3d/0x70
> [<c01040db>] sysenter_do_call+0x12/0x3f
> [<c0670000>] ? _read_unlock_irqrestore+0x60/0x70
> =======================
> Enabling non-boot CPUs ...
>
> ...and then, while writing the image, it looks something like this:
>
> APIC error on CPU1: 40(40)
> ^H^H^H^H 3%<7>APIC error on CPU1: 40(40)
> WQ on CPU0, prefer CPU1
> APIC error on CPU1: 40(40)
> [8 times]
> APIC error on CPU1: 40(40)
> WQ on CPU0, prefer CPU1
> ...
>
> The machine comes back with this spamming the logs (rate of about 2 KB/sec):
>
> WQ on CPU0, prefer CPU1
> APIC error on CPU1: 40(40)
> [9 times, repeat]
>
> I can still log in with ssh and reboot the machine, which finally
> gives this warning:
>
> BUG: sleeping function called from invalid context at kernel/sched.c:4625
> in_atomic():1, irqs_disabled():0
> INFO: lockdep is turned off.
> Pid: 4027, comm: oprofiled Tainted: G W 2.6.27-rc3-00415-g122c9e0 #10
> [<c012c98e>] __might_sleep+0xee/0x140
> [<c066c8d8>] wait_for_common+0x28/0x130
> [<c01595bb>] ? trace_hardirqs_on+0xb/0x10
> [<c066ff13>] ? _spin_unlock_irqrestore+0x43/0x70
> [<c066ca72>] wait_for_completion+0x12/0x20
> [<c014715d>] synchronize_rcu+0x2d/0x40
> [<c01471f0>] ? wakeme_after_rcu+0x0/0x10
> [<c01595bb>] ? trace_hardirqs_on+0xb/0x10
> [<c014e156>] atomic_notifier_chain_unregister+0x36/0x50
> [<c014e17f>] unregister_die_notifier+0xf/0x20
> [<c0587b84>] nmi_shutdown+0x44/0x80
> [<c0585d4a>] oprofile_shutdown+0x2a/0x60
> [<c0586b50>] event_buffer_release+0x10/0x40
> [<c01a8f26>] __fput+0xb6/0x180
> [<c01a92a9>] fput+0x19/0x20
> [<c01a63c7>] filp_close+0x47/0x70
> [<c013741b>] put_files_struct+0x9b/0xb0
> [<c0137472>] exit_files+0x42/0x60
> [<c01386a2>] do_exit+0x192/0x880
> [<c014d3b6>] ? up_read+0x16/0x30
> [<c0120493>] ? do_page_fault+0x2e3/0x700
> [<c01a7cc0>] ? do_sync_write+0x0/0x110
> [<c0138dc1>] do_group_exit+0x31/0x90
> [<c0138e2f>] sys_exit_group+0xf/0x20
> [<c01040db>] sysenter_do_call+0x12/0x3f
> =======================
> Sending all processes the KILL signal...
>
> The last time I did this, the machine also came up with a corrupt ext3
> root filesystem. But it didn't happen this time, so I guess it could
> also be unrelated.
>
> What would be the proper way to fix all this? Should oprofile be able
> to profile the suspend/resume code as well? Or should "echo disk" give
> -EBUSY if oprofile is running? Or
> do we simply insert a huge kludge in the form of mutual exclusion
> between oprofile and suspend in the Kconfig?
That last thing is probably the safest to do at the moment.
Apparently nmi_suspend() conflicts with oprofile somehow. Also, the offlining
of non-boot CPUs may confuse it. It would be helpful to check if the CPU
hotplug works with oprofile.
Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists