lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e0b96b01-a033-4e28-08c8-fc04f8edd2fc@intel.com>
Date:   Mon, 31 Jul 2017 10:21:12 +0300
From:   Tomi Sarvela <tomi.p.sarvela@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Martin Peres <martin.peres@...ux.intel.com>,
        jeffy.chen@...k-chips.com, linux-kernel@...r.kernel.org
Subject: Re: Suspend-resume failure on Intel Eagle Lake Core2Duo

On 28/07/17 19:26, Thomas Gleixner wrote:
> On Fri, 28 Jul 2017, Tomi Sarvela wrote:
>> On 28/07/17 17:50, Thomas Gleixner wrote:
>>> On Fri, 28 Jul 2017, Tomi Sarvela wrote:
>>>> On 28/07/17 17:13, Thomas Gleixner wrote:
>>>>> On Fri, 28 Jul 2017, Tomi Sarvela wrote:
>>>>>> On 28/07/17 16:15, Thomas Gleixner wrote:
>>>>>>> Another question. Is the machine completely dead or not?
>>>>>>
>>>>>> Completely dead. Powerled is on, so host isn't shut down.
>>>>>
>>>>> So that means it does not even power the machine down. That's what I
>>>>> expected least.
>>>>>
>>>>>> Serial or network if don't give any signs of life.
>>>>>
>>>>>> Patch applies cleanly but still getting the same error:
>>>>>
>>>>> Sorry for the noise. I'm an idiot trying to do 10 things at once. This
>>>>> time
>>>>> it actually compiles and links.
>>>>>
>>>>> If the machine does still not powerdown with this applied, then please
>>>>> redo
>>>>> the 'platform' test and grab the trace for that one.
>>>>
>>>> This patch fixes the issue. Below is the dmesg from the testrun (sorry for
>>>> the spam, we're primarily testing i915 issues).
>>>
>>> Can you please retrieve the trace data from:
>>>
>>> /sys/kernel/debug/tracing/trace
>>>
>>> and provide that. The dmesg does not help much.
>>
>> Right, here you go.
> 
> Thanks for providing the data. Just to be sure, that data was from a real
> suspend, not the 'platform' test, right?
> 
> If so, that does not make any sense at all. The patch merily changes the
> enable/resume path and adds the debug trace printks which have no influence
> on the disable logic. But you said that the machine does not power off in
> the bad case. That does not make any sense at all as the enable logic is
> not involved at all in the suspend path.
> 
> Did you change anything else compared to the tests before ?

I did check that the problem persisted in linus-HEAD before testing your 
patch. The testing was done in order (reading from console logs I happen 
to still have in one window):

- reboot to patched kernel

[    0.000000] Linux version 4.13.0-rc2+ (testrunner@elk) (gcc version 
6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)) #5 SMP PREEMPT Fri Jul 28 
17:15:47 EEST 2017
[    0.000000] Command line: BOOT_IMAGE=/boot/tsa.efi root=/dev/sda1 
console=ttyS0,115200n8 console=tty0 intel_iommu=igfx_off drm.debug=0xe 
nmi_watchdog=panic,auto panic=1 softdog.soft_panic=1 
scsi_mod.use_blk_mq=0 rootwait ro 3

- "real" 15sec suspend test through IGT/piglit and rtcwake

$ ./scripts/run-tests.sh -vt igt@..._exec_suspend@...ic-s3 -x devices

- dmesg to go with suspend

[ 1189.597665] Suspended for 14.825 seconds
[ 1189.597665] Delta way too big! 18446743991837909721 
ts=18446744056274518328 write stamp = 64436608607
                If you just came from a suspend/resume,
                please switch to the trace global clock:
                  echo global > /sys/kernel/debug/tracing/trace_clock
[ 1189.597665] ------------[ cut here ]------------
[ 1189.597665] WARNING: CPU: 0 PID: 1332 at 
kernel/trace/ring_buffer.c:2647 rb_handle_timestamp.isra.32+0x71/0x80
[ 1189.597665] Modules linked in: snd_hda_codec_realtek i915 
snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core 
coretemp snd_pcm e1000e lpc_ich mei_me ptp mei pps_core
[ 1189.597665] CPU: 0 PID: 1332 Comm: rtcwake Tainted: G     U 
4.13.0-rc2+ #5
[ 1189.597665] Hardware name: Hewlett-Packard HP Compaq 8000 Elite CMT 
PC/3647h, BIOS 786G7 v01.13 07/20/2011
[ 1189.597665] task: ffff88010dce9880 task.stack: ffffc900000c8000
[ 1189.597665] RIP: 0010:rb_handle_timestamp.isra.32+0x71/0x80
[ 1189.597665] RSP: 0018:ffffc900000cbab0 EFLAGS: 00010082
[ 1189.597665] RAX: 00000000000000e0 RBX: ffffc900000cbad0 RCX: 
0000000000000004
[ 1189.597665] RDX: 0000000080000004 RSI: 0000000000000082 RDI: 
00000000ffffffff
[ 1189.597665] RBP: ffffc900000cbac0 R08: 0000000000000000 R09: 
00000000000000e0
[ 1189.597665] R10: ffffc900000cbbe0 R11: 00000000000ebc08 R12: 
ffff880117008890
[ 1189.597665] R13: ffff8801170088b0 R14: 00000000000003e8 R15: 
0000000000000005
[ 1189.597665] FS:  00007f93d957e700(0000) GS:ffff88011bc00000(0000) 
knlGS:0000000000000000
[ 1189.597665] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1189.597665] CR2: 000000f00255b068 CR3: 00000000d9063000 CR4: 
00000000000406f0
[ 1189.597665] Call Trace:
[ 1189.597665]  ring_buffer_lock_reserve+0x1fa/0x350
[ 1189.597665]  trace_vbprintk+0xdc/0x260
[ 1189.597665]  ? __irq_disable+0x1b/0xc0
[ 1189.597665]  __trace_bprintk+0x4a/0x60
[ 1189.597665]  ? preempt_count_add+0x9e/0xb0
[ 1189.597665]  __irq_disable+0x3f/0xc0
[ 1189.597665]  irq_disable+0x17/0x20
[ 1189.597665]  __disable_irq_nosync+0x59/0x70
[ 1189.597665]  disable_hardirq+0x11/0x30
[ 1189.597665]  hpet_msi_resume+0x85/0xd0
[ 1189.597665]  clockevents_tick_resume+0x14/0x20
[ 1189.597665]  tick_resume_local+0x32/0x60
[ 1189.597665]  tick_resume+0x13/0x20
[ 1189.597665]  timekeeping_resume+0x149/0x1a0
[ 1189.597665]  syscore_resume+0x4b/0x190
[ 1189.597665]  ? syscore_resume+0x4b/0x190
[ 1189.597665]  suspend_devices_and_enter+0x6b9/0x810
[ 1189.597665]  pm_suspend+0x367/0x540
[ 1189.597665]  state_store+0x7e/0xf0
[ 1189.597665]  kobj_attr_store+0xf/0x20
[ 1189.597665]  sysfs_kf_write+0x37/0x40
[ 1189.597665]  kernfs_fop_write+0x110/0x1a0
[ 1189.597665]  __vfs_write+0x28/0x130
[ 1189.597665]  ? __this_cpu_preempt_check+0x13/0x20
[ 1189.597665]  ? __sb_start_write+0x55/0xe0
[ 1189.597665]  vfs_write+0xb6/0x1a0
[ 1189.597665]  SyS_write+0x46/0xb0
[ 1189.597665]  entry_SYSCALL_64_fastpath+0x17/0x98
[ 1189.597665] RIP: 0033:0x7f93d90ac8f0
[ 1189.597665] RSP: 002b:00007ffd80a9ca88 EFLAGS: 00000246 ORIG_RAX: 
0000000000000001
[ 1189.597665] RAX: ffffffffffffffda RBX: 000000f00255a050 RCX: 
00007f93d90ac8f0
[ 1189.597665] RDX: 0000000000000004 RSI: 000000f00255a060 RDI: 
0000000000000007
[ 1189.597665] RBP: 00007f93d9375b00 R08: 000000f002557d90 R09: 
00007f93d957e700
[ 1189.597665] R10: 00007f93d9375b58 R11: 0000000000000246 R12: 
00007f93d9375b58
[ 1189.597665] R13: 00007f93d9375b58 R14: 000000000000270f R15: 
0000000000001010
[ 1189.597665] Code: f0 48 8b 73 08 85 c0 48 8b 13 48 c7 c0 88 9a a5 81 
49 c7 c0 1a 3b a8 81 4c 0f 44 c0 48 8b 0f 48 c7 c7 10 9b a5 81 e8 c0 c8 
fa ff <0f> ff eb a7 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56
[ 1189.597665] ---[ end trace 7d99ac836f161565 ]---

- printing out the trace from /sys/kernel/debug/tracing/trace


Tomi
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ