Message-ID: <10527192.nUPlyArG6x@jkrzyszt-mobl2.ger.corp.intel.com>
Date: Thu, 12 Sep 2024 19:33:29 +0200
From: Janusz Krzysztofik <janusz.krzysztofik@...ux.intel.com>
To: Kees Cook <kees@...nel.org>
Cc: Tony Luck <tony.luck@...el.com>,
 "Guilherme G. Piccoli" <gpiccoli@...lia.com>, linux-hardening@...r.kernel.org
Subject: pstore: backend (efi_pstore) writing error (-22)

Hi,

While manually reproducing some "incomplete -- No warnings/errors" issues
that Intel GFX CI (https://intel-gfx-ci.01.org/) reported to
https://gitlab.freedesktop.org/groups/drm/i915/-/issues, I captured a few
machine check exception reports, each followed by warnings from an
unsuccessful attempt to copy the log to pstore.  I was using kernels built
from https://gitlab.freedesktop.org/drm/tip.git, a linux-next-like
repository for integration testing of changes to the drm subsystem, based
on mainline.

I haven't found any similar reports on the net.  Could you please have a
look and check whether this is a known issue?  If it is not, can it be
fixed or, in the worst case, if it is not fixable after an MCE hit, be
allowed to fail silently?

Thanks,
Janusz


[11903.741247] mce: CPUs not responding to MCE broadcast (may include false positives): 0,2-3,5,9-11,13-15
[11903.741254] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[11904.768716] Shutting down cpus with NMI
[11904.778998] Kernel Offset: disabled
[11904.793081] ------------[ cut here ]------------
[11904.793082] WARNING: CPU: 4 PID: 0 at arch/x86/kernel/fpu/core.c:60 kernel_fpu_begin_mask+0xe5/0x110
[11904.793089] Modules linked in: dm_crypt snd_hda_codec_hdmi i915 x86_pkg_temp_thermal coretemp kvm_intel kvm snd_intel_dspcfg snd_hda_codec mei_gsc_proxy prime_numbers snd_hwdep i2c_algo_bit crct10dif_pclmul wmi_bmof e1000e ttm video snd_hda_core i2c_i801 crc32_pclmul drm_display_helper ptp mei_me ghash_clmulni_intel i2c_mux snd_pcm thunderbolt pps_core mei i2c_smbus drm_buddy wmi fuse [last unloaded: snd_hda_intel]
[11904.793103] CPU: 4 UID: 0 PID: 0 Comm: swapper/4 Not tainted 6.11.0-rc6-VLK-63048-gd39ebf112371 #1
[11904.793105] Hardware name: Intel Corporation Arrow Lake Client Platform/ARL-H Lp5x T4 RVP, BIOS MTLPFWI1.R00.4213.D81.2405221214 05/22/2024
[11904.793106] RIP: 0010:kernel_fpu_begin_mask+0xe5/0x110
[11904.793108] Code: 44 48 83 c4 10 5b c3 cc cc cc cc 48 8b 07 f6 c4 40 75 af f0 80 4f 01 40 48 81 c7 40 25 00 00 e8 81 fe ff ff eb 9c db e3 eb c7 <0f> 0b 0f 0b 65 0f b6 05 17 a9 fd 7e 84 c0 0f 84 6c ff ff ff 0f 0b
[11904.793109] RSP: 0018:fffffe00000ff9b8 EFLAGS: 00010006
[11904.793110] RAX: 0000000080110004 RBX: 0000000000000003 RCX: 0000000000000000
[11904.793111] RDX: 0000000000000002 RSI: ffff88800263cfe0 RDI: 0000000000000001
[11904.793112] RBP: fffffe00000ffa38 R08: ffffffff8263a000 R09: ffff888000000000
[11904.793112] R10: ffff888100275000 R11: 0000000000000000 R12: fffffe00000ffa40
[11904.793112] R13: fffffe00000ffa48 R14: fffffe00000ffaf0 R15: ffff8881028ba800
[11904.793113] FS:  0000000000000000(0000) GS:ffff888470100000(0000) knlGS:0000000000000000
[11904.793114] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11904.793114] CR2: 000056396e185318 CR3: 0000000109330003 CR4: 0000000000f70ef0
[11904.793115] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[11904.793115] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[11904.793116] PKRU: 55555554
[11904.793116] Call Trace:
[11904.793117]  <#MC>
[11904.793118]  ? __warn.cold+0xb1/0x145
[11904.793121]  ? kernel_fpu_begin_mask+0xe5/0x110
[11904.793122]  ? report_bug+0xea/0x170
[11904.793124]  ? handle_bug+0x3a/0x70
[11904.793126]  ? exc_invalid_op+0x17/0x70
[11904.793127]  ? asm_exc_invalid_op+0x1a/0x20
[11904.793129]  ? kernel_fpu_begin_mask+0xe5/0x110
[11904.793130]  ? kernel_fpu_begin_mask+0x23/0x110
[11904.793131]  arch_efi_call_virt_setup+0x13/0x80
[11904.793134]  virt_efi_query_variable_info_nb+0x58/0xd0
[11904.793137]  efi_query_variable_store+0x186/0x1d0
[11904.793138]  ? rcu_is_watching+0x11/0x50
[11904.793140]  ? lock_acquire+0x280/0x2f0
[11904.793141]  ? down_trylock+0x24/0x30
[11904.793142]  ? rcu_is_watching+0x11/0x50
[11904.793143]  efivar_set_variable_locked+0x9d/0xf0
[11904.793146]  efi_pstore_write+0x114/0x160
[11904.793149]  ? pstore_dump+0xe5/0x350
[11904.793152]  pstore_dump+0xe5/0x350
[11904.793154]  kmsg_dump_desc+0x97/0x190
[11904.793156]  panic+0x178/0x2b1
[11904.793158]  mce_panic+0x129/0x210
[11904.793160]  mce_timed_out+0x60/0xa0
[11904.793161]  mce_start+0x96/0x130
[11904.793162]  do_machine_check+0x995/0xad0
[11904.793164]  ? intel_idle+0x59/0xa0
[11904.793165]  exc_machine_check+0x66/0x90
[11904.793167]  asm_exc_machine_check+0x1e/0x40
[11904.793168] RIP: 0010:intel_idle+0x59/0xa0
[11904.793168] Code: 3e 0f ae f0 31 d2 48 89 f0 48 89 d1 0f 01 c8 48 8b 06 a8 08 75 14 eb 07 0f 00 2d ce 5d 30 00 b9 01 00 00 00 4c 89 c0 0f 01 c9 <f0> 80 66 02 df f0 83 44 24 fc 00 48 8b 06 a8 08 74 0b 65 81 25 b2
[11904.793169] RSP: 0018:ffffc900001dfe78 EFLAGS: 00000046
[11904.793170] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
[11904.793170] RDX: 0000000000000000 RSI: ffff888100f28040 RDI: 0000000000000001
[11904.793170] RBP: ffffe8ffffb3c140 R08: 0000000000000000 R09: 0000000000000000
[11904.793171] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff827c1740
[11904.793171] R13: ffffffff827c17c0 R14: 0000000000000001 R15: 000000000002e74c
[11904.793172]  </#MC>
[11904.793173]  <TASK>
[11904.793173]  cpuidle_enter_state+0xbd/0x540
[11904.793174]  cpuidle_enter+0x28/0x40
[11904.793178]  do_idle+0x1b9/0x210
[11904.793180]  cpu_startup_entry+0x24/0x30
[11904.793181]  start_secondary+0x11a/0x140
[11904.793183]  common_startup_64+0x13e/0x148
[11904.793186]  </TASK>
[11904.793186] irq event stamp: 710980000
[11904.793187] hardirqs last  enabled at (710979999): [<ffffffff81a40cf8>] cpuidle_enter+0x28/0x40
[11904.793188] hardirqs last disabled at (710980000): [<ffffffff81d7135b>] exc_machine_check+0x5b/0x90
[11904.793190] softirqs last  enabled at (710979990): [<ffffffff810a0153>] irq_exit_rcu+0x83/0xe0
[11904.793191] softirqs last disabled at (710979983): [<ffffffff810a0153>] irq_exit_rcu+0x83/0xe0
[11904.793192] ---[ end trace 0000000000000000 ]---
[11904.793193] ------------[ cut here ]------------
[11904.793193] WARNING: CPU: 4 PID: 0 at arch/x86/kernel/fpu/core.c:425 kernel_fpu_begin_mask+0xe7/0x110
[11904.793194] Modules linked in: dm_crypt snd_hda_codec_hdmi i915 x86_pkg_temp_thermal coretemp kvm_intel kvm snd_intel_dspcfg snd_hda_codec mei_gsc_proxy prime_numbers snd_hwdep i2c_algo_bit crct10dif_pclmul wmi_bmof e1000e ttm video snd_hda_core i2c_i801 crc32_pclmul drm_display_helper ptp mei_me ghash_clmulni_intel i2c_mux snd_pcm thunderbolt pps_core mei i2c_smbus drm_buddy wmi fuse [last unloaded: snd_hda_intel]
[11904.793200] CPU: 4 UID: 0 PID: 0 Comm: swapper/4 Tainted: G        W          6.11.0-rc6-VLK-63048-gd39ebf112371 #1
[11904.793201] Tainted: [W]=WARN
[11904.793202] Hardware name: Intel Corporation Arrow Lake Client Platform/ARL-H Lp5x T4 RVP, BIOS MTLPFWI1.R00.4213.D81.2405221214 05/22/2024
[11904.793202] RIP: 0010:kernel_fpu_begin_mask+0xe7/0x110
[11904.793203] Code: 83 c4 10 5b c3 cc cc cc cc 48 8b 07 f6 c4 40 75 af f0 80 4f 01 40 48 81 c7 40 25 00 00 e8 81 fe ff ff eb 9c db e3 eb c7 0f 0b <0f> 0b 65 0f b6 05 17 a9 fd 7e 84 c0 0f 84 6c ff ff ff 0f 0b e9 65
[11904.793203] RSP: 0018:fffffe00000ff9b8 EFLAGS: 00010006
[11904.793204] RAX: 0000000080110004 RBX: 0000000000000003 RCX: 0000000000000000
[11904.793204] RDX: 0000000000000002 RSI: ffff88800263cfe0 RDI: 0000000000000001
[11904.793205] RBP: fffffe00000ffa38 R08: ffffffff8263a000 R09: ffff888000000000
[11904.793205] R10: ffff888100275000 R11: 0000000000000000 R12: fffffe00000ffa40
[11904.793205] R13: fffffe00000ffa48 R14: fffffe00000ffaf0 R15: ffff8881028ba800
[11904.793206] FS:  0000000000000000(0000) GS:ffff888470100000(0000) knlGS:0000000000000000
[11904.793206] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11904.793207] CR2: 000056396e185318 CR3: 0000000109330003 CR4: 0000000000f70ef0
[11904.793207] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[11904.793207] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[11904.793208] PKRU: 55555554
[11904.793208] Call Trace:
[11904.793208]  <#MC>
[11904.793209]  ? __warn.cold+0xb1/0x145
[11904.793210]  ? kernel_fpu_begin_mask+0xe7/0x110
[11904.793210]  ? report_bug+0xea/0x170
[11904.793212]  ? handle_bug+0x3a/0x70
[11904.793213]  ? exc_invalid_op+0x17/0x70
[11904.793213]  ? asm_exc_invalid_op+0x1a/0x20
[11904.793215]  ? kernel_fpu_begin_mask+0xe7/0x110
[11904.793216]  ? kernel_fpu_begin_mask+0x23/0x110
[11904.793217]  arch_efi_call_virt_setup+0x13/0x80
[11904.793218]  virt_efi_query_variable_info_nb+0x58/0xd0
[11904.793219]  efi_query_variable_store+0x186/0x1d0
[11904.793220]  ? rcu_is_watching+0x11/0x50
[11904.793221]  ? lock_acquire+0x280/0x2f0
[11904.793221]  ? down_trylock+0x24/0x30
[11904.793222]  ? rcu_is_watching+0x11/0x50
[11904.793223]  efivar_set_variable_locked+0x9d/0xf0
[11904.793225]  efi_pstore_write+0x114/0x160
[11904.793226]  ? pstore_dump+0xe5/0x350
[11904.793227]  pstore_dump+0xe5/0x350
[11904.793229]  kmsg_dump_desc+0x97/0x190
[11904.793230]  panic+0x178/0x2b1
[11904.793232]  mce_panic+0x129/0x210
[11904.793233]  mce_timed_out+0x60/0xa0
[11904.793234]  mce_start+0x96/0x130
[11904.793235]  do_machine_check+0x995/0xad0
[11904.793237]  ? intel_idle+0x59/0xa0
[11904.793238]  exc_machine_check+0x66/0x90
[11904.793240]  asm_exc_machine_check+0x1e/0x40
[11904.793240] RIP: 0010:intel_idle+0x59/0xa0
[11904.793241] Code: 3e 0f ae f0 31 d2 48 89 f0 48 89 d1 0f 01 c8 48 8b 06 a8 08 75 14 eb 07 0f 00 2d ce 5d 30 00 b9 01 00 00 00 4c 89 c0 0f 01 c9 <f0> 80 66 02 df f0 83 44 24 fc 00 48 8b 06 a8 08 74 0b 65 81 25 b2
[11904.793241] RSP: 0018:ffffc900001dfe78 EFLAGS: 00000046
[11904.793242] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
[11904.793242] RDX: 0000000000000000 RSI: ffff888100f28040 RDI: 0000000000000001
[11904.793243] RBP: ffffe8ffffb3c140 R08: 0000000000000000 R09: 0000000000000000
[11904.793243] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff827c1740
[11904.793243] R13: ffffffff827c17c0 R14: 0000000000000001 R15: 000000000002e74c
[11904.793245]  </#MC>
[11904.793245]  <TASK>
[11904.793245]  cpuidle_enter_state+0xbd/0x540
[11904.793246]  cpuidle_enter+0x28/0x40
[11904.793247]  do_idle+0x1b9/0x210
[11904.793248]  cpu_startup_entry+0x24/0x30
[11904.793249]  start_secondary+0x11a/0x140
[11904.793250]  common_startup_64+0x13e/0x148
[11904.793252]  </TASK>
[11904.793253] irq event stamp: 710980000
[11904.793253] hardirqs last  enabled at (710979999): [<ffffffff81a40cf8>] cpuidle_enter+0x28/0x40
[11904.793254] hardirqs last disabled at (710980000): [<ffffffff81d7135b>] exc_machine_check+0x5b/0x90
[11904.793255] softirqs last  enabled at (710979990): [<ffffffff810a0153>] irq_exit_rcu+0x83/0xe0
[11904.793256] softirqs last disabled at (710979983): [<ffffffff810a0153>] irq_exit_rcu+0x83/0xe0
[11904.793256] ---[ end trace 0000000000000000 ]---
[11905.755531] pstore: backend (efi_pstore) writing error (-22)
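
For reference when reading the last log line: the value in parentheses
appears to be a negated errno, and 22 is EINVAL in the Linux errno
numbering.  A minimal sketch for decoding such codes follows; the helper
name describe_kernel_error is hypothetical and not part of any kernel or
pstore tooling, and the message text comes from the local C library.

```python
import errno
import os


def describe_kernel_error(code: int) -> str:
    """Map a negative kernel-style return code to its errno name and message.

    Hypothetical helper for illustration only; it assumes the code is a
    negated errno value, as in "writing error (-22)".
    """
    number = abs(code)
    # errno.errorcode maps numeric errno values to symbolic names.
    name = errno.errorcode.get(number, "UNKNOWN")
    # os.strerror returns the platform's message for that errno value.
    return f"{code}: {name} ({os.strerror(number)})"


if __name__ == "__main__":
    print(describe_kernel_error(-22))
```

The decoded name only says that some layer rejected an argument; it does
not by itself show which call in the trace above produced the error.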


