Message-ID: <5291C948.1080305@gmail.com>
Date: Sun, 24 Nov 2013 10:39:20 +0100
From: Francis Moreau <francis.moro@...il.com>
To: Thomas Gleixner <tglx@...utronix.de>,
"Rafael J. Wysocki" <rjw@...ysocki.net>
CC: Jingoo Han <jg1.han@...sung.com>, 'Borislav Petkov' <bp@...en8.de>,
'Wei WANG' <wei_wang@...lsil.com.cn>,
'LKML' <linux-kernel@...r.kernel.org>,
'Samuel Ortiz' <sameo@...ux.intel.com>,
'Chris Ball' <cjb@...top.org>
Subject: Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
Hello Thomas
On 11/22/2013 11:27 PM, Thomas Gleixner wrote:
> On Fri, 22 Nov 2013, Rafael J. Wysocki wrote:
>> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
>>> Ok, I've finally managed to find out the bad commit:
>>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
>>> over system PM transitions
>>>
>>> I verified that the parent commit doesn't have the problem.
>>
>> Interesting.
>>
>>> Rafael, you're the man now ;)
>>
>> I kind of don't see how that commit may result in behavior that you
>> described earlier in the thread.
>>
>> You get a memory corruption that seems to have started to happen because
>> we're holding an additional lock over suspend resume now. Something's fishy
>> on that machine and we need to figure out what it is.
>
> The hiccup happens in the timer softirq.
>
> @Francis: Did you try to enable DEBUG_OBJECTS.*? If not, please give it
> a try.
That turned out to be a good idea.
With DEBUG_OBJECTS enabled, the kernel now prints the following trace after resuming:
[ 26.973928] WARNING: CPU: 0 PID: 4 at lib/debugobjects.c:260 debug_print_object+0x83/0xa0()
[ 26.973932] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
[ 26.973972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp rtsx_pci_ms coretemp memstick kvm_intel i2c_i801 iTCO_wdt iTCO_vendor_support i915 i2c_algo_bit intel_agp intel_gtt drm_kms_helper r8169 drm kvm mii agpgart i2c_core lpc_ich ac shpchp crc32c_intel battery thermal wmi evdev mei_me video mei button mperf processor serio_raw microcode ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod usb_storage rtsx_pci_sdmmc mmc_core ahci libahci libata ehci_pci ehci_hcd xhci_hcd scsi_mod rtsx_pci usbcore usb_common
[ 26.974013] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.11.0-rc2-ARCH #64
[ 26.974014] Hardware name: CLEVO CO. W55xEU /W55xEU , BIOS 4.6.5 03/05/2013
[ 26.974019] Workqueue: kacpi_hotplug hotplug_event_work
[ 26.974020] 0000000000000009 ffff880407d0da18 ffffffff81459fe9 ffff880407d0da60
[ 26.974023] ffff880407d0da50 ffffffff8104dc7d ffff880407fad488 ffffffff81836fc0
[ 26.974025] ffffffff81701358 ffffffff81afef70 0000000000000003 ffff880407d0dab0
[ 26.974027] Call Trace:
[ 26.974031] [<ffffffff81459fe9>] dump_stack+0x54/0x8d
[ 26.974043] [<ffffffff8104dc7d>] warn_slowpath_common+0x7d/0xa0
[ 26.974044] [<ffffffff8104dcec>] warn_slowpath_fmt+0x4c/0x50
[ 26.974047] [<ffffffff81261433>] debug_print_object+0x83/0xa0
[ 26.974050] [<ffffffff8106b820>] ? queue_work_on+0x50/0x50
[ 26.974053] [<ffffffff81261c2b>] __debug_check_no_obj_freed+0x1fb/0x240
[ 26.974059] [<ffffffffa008e959>] ? rtsx_pci_remove+0x119/0x1d0 [rtsx_pci]
[ 26.974062] [<ffffffff81262619>] debug_check_no_obj_freed+0x19/0x20
[ 26.974065] [<ffffffff8116f861>] kfree+0x191/0x210
[ 26.974069] [<ffffffff813819e0>] ? pcibios_disable_device+0x20/0x30
[ 26.974072] [<ffffffffa008e959>] ? rtsx_pci_remove+0x119/0x1d0 [rtsx_pci]
[ 26.974075] [<ffffffffa008e959>] rtsx_pci_remove+0x119/0x1d0 [rtsx_pci]
[ 26.974079] [<ffffffff8128004b>] pci_device_remove+0x3b/0xb0
[ 26.974092] [<ffffffff8132c92f>] __device_release_driver+0x7f/0xf0
[ 26.974094] [<ffffffff8132c9c3>] device_release_driver+0x23/0x30
[ 26.974096] [<ffffffff8132c194>] bus_remove_device+0xf4/0x170
[ 26.974098] [<ffffffff81328c55>] device_del+0x135/0x1d0
[ 26.974108] [<ffffffff8127ae24>] pci_stop_bus_device+0x94/0xa0
[ 26.974110] [<ffffffff8127af32>] pci_stop_and_remove_bus_device+0x12/0x20
[ 26.974113] [<ffffffff81297466>] disable_slot+0x76/0xd0
[ 26.974115] [<ffffffff81297568>] acpiphp_check_bridge+0xa8/0xd0
[ 26.974118] [<ffffffff81297c8a>] hotplug_event+0xfa/0x210
[ 26.974120] [<ffffffff81297dc7>] hotplug_event_work+0x27/0x60
[ 26.974123] [<ffffffff8106c178>] process_one_work+0x178/0x470
[ 26.974125] [<ffffffff8106cb91>] worker_thread+0x121/0x3a0
[ 26.974127] [<ffffffff8106ca70>] ? manage_workers.isra.21+0x2b0/0x2b0
[ 26.974130] [<ffffffff81073a50>] kthread+0xc0/0xd0
[ 26.974132] [<ffffffff81073990>] ? kthread_create_on_node+0x120/0x120
[ 26.974135] [<ffffffff814688ec>] ret_from_fork+0x7c/0xb0
[ 26.974137] [<ffffffff81073990>] ? kthread_create_on_node+0x120/0x120
[ 26.974139] ---[ end trace 0895c2e7925b5485 ]---
Also, the kernel doesn't panic anymore.
I'm also attaching the dmesg captured with CONFIG_DEBUG_KOBJECT and
CONFIG_DEBUG_OBJECT* enabled.
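For anyone going through a log like this one, here is a minimal sketch that pulls out the ODEBUG summary line and the reliable call-trace frames (the ones the unwinder did not mark with `?`). It assumes the trace format shown in this mail; the names `parse_odebug` and `reliable_frames` are made up for illustration and are not part of any kernel tooling. Whitespace is matched loosely so that mail-wrapped lines still parse.

```python
import re

# ODEBUG summary line, e.g.
# "ODEBUG: free active (active state 0) object type: timer_list hint: fn+0x0/0x20"
ODEBUG_RE = re.compile(
    r"ODEBUG:\s+(?P<problem>.+?)\s+\(active state\s+(?P<state>\d+)\)\s+"
    r"object type:\s+(?P<type>\S+)\s+hint:\s+(?P<hint>\S+)"
)

# Call-trace frame, e.g. "[<ffffffff8116f861>] kfree+0x191/0x210" with an
# optional leading "? " (unreliable frame) and optional trailing "[module]".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_odebug(text):
    """Return the fields of the first ODEBUG line as a dict, or None."""
    m = ODEBUG_RE.search(text)
    return m.groupdict() if m else None


def reliable_frames(text):
    """Return (symbol, module) pairs for frames not marked with '?'."""
    return [
        (m.group("sym"), m.group("mod"))
        for m in FRAME_RE.finditer(text)
        if not m.group("unreliable")
    ]


# A few lines copied from the trace in this mail, used as sample input.
SAMPLE = """\
[ 26.973932] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
[ 26.974050] [<ffffffff8106b820>] ? queue_work_on+0x50/0x50
[ 26.974065] [<ffffffff8116f861>] kfree+0x191/0x210
[ 26.974075] [<ffffffffa008e959>] rtsx_pci_remove+0x119/0x1d0 [rtsx_pci]
"""

if __name__ == "__main__":
    print(parse_odebug(SAMPLE))
    for sym, mod in reliable_frames(SAMPLE):
        print(sym, mod or "(core kernel)")
```

Run against the full attachment, this should narrow things down to the object type, the hint function, and the path that led to the kfree().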
Thanks.
Download attachment "dmesg-with-debug-objects.txt.gz" of type "application/gzip" (62316 bytes)