lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200618102135.GA616959@hirez.programming.kicks-ass.net>
Date:   Thu, 18 Jun 2020 12:21:35 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Ilkka Prusi <ilkka.prusi@...inet.fi>
Cc:     linux-kernel@...r.kernel.org, riel@...riel.com, luto@...nel.org,
        x86@...nel.org
Subject: Re: WARNING at switch_mm_irqs_off, followed by frozen machine

On Wed, Jun 17, 2020 at 02:32:39AM +0300, Ilkka Prusi wrote:
> Hi,
> 
> Yesterday my computer with kernel version 5.7.2 was frozen badly enough that
> hard reset was necessary (did not react to SysRq keys). Upon checking logs I
> found following warning and information from the time just before resetting
> it.
> 
> Computer has AMD Ryzen 7 2700, Asus B450 motherboard.
> 
> 5.8-rc1 encountered BUG() and did not boot (iommu and smp_processor_id()
> called from wrong context, I'll see if I can catch log somehow).

Since you're building your own kernels, can you make sure to always
have:

 - CONFIG_DEBUG_BUGVERBOSE=y
 - CONFIG_DEBUG_INFO=y

and then when reporting, run the thing through:

  ./scripts/decode_stacktrace.sh

Because the below splat is _VERY_ hard to decipher because of all the
missing information.

> kernel: [ 1166.739530] ------------[ cut here ]------------
> kernel: [ 1166.739539] WARNING: CPU: 4 PID: 0 at
> switch_mm_irqs_off+0x3ef/0x4a0

With BUGVERBOSE the above would've included a __LINE__ number which
would've easily identified which WARN went *splat*, but now..

> kernel: [ 1166.739540] Modules linked in: nf_tables(E) nfnetlink(E)
> binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) snd_hda_cod
> ec_realtek(E) snd_hda_codec_generic(E) ledtrig_audio(E)
> snd_hda_codec_hdmi(E) snd_usb_audio(E) snd_hda_intel(E) snd_intel_dspcfg(E)
> snd_usbmidi_lib(E) snd_hda_cod
> ec(E) snd_rawmidi(E) snd_hda_core(E) snd_hwdep(E) snd_seq_device(E) mc(E)
> snd_pcm_oss(E) amdgpu(E) snd_mixer_oss(E) gpu_sched(E) snd_pcm(E) ttm(E)
> snd_timer(E) jo
> ydev(E) drm_kms_helper(E) snd(E) cec(E) soundcore(E) eeepc_wmi(E)
> asus_wmi(E) fb_sys_fops(E) syscopyarea(E) battery(E) edac_mce_amd(E)
> sparse_keymap(E) sysfillrec
> t(E) sysimgblt(E) i2c_algo_bit(E) kvm_amd(E) video(E) kvm(E) irqbypass(E)
> 8250(E) ccp(E) wmi_bmof(E) sg(E) 8250_base(E) r8169(E) serial_mctrl_gpio(E)
> rng_core(E)
> crct10dif_pclmul(E) realtek(E) serial_core(E) crc32_pclmul(E) libphy(E)
> ghash_clmulni_intel(E) aesni_intel(E) wmi(E) i2c_piix4(E) k10temp(E)
> crypto_simd(E) cryptd
> (E) efi_pstore(E) glue_helper(E) efivars(E) acpi_cpufreq(E) button(E)
> kernel: [ 1166.739583]  nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E)
> parport_pc(E) drm(E) sunrpc(E) ppdev(E) lp(E) agpgart(E) pa
> rport(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) raid10(E)
> raid456(E) libcrc32c(E) async_raid6_recov(E) async_memcpy(E) async_pq(E)
> async_xor(E) xor(E) async_tx(E) raid6_pq(E) raid1(E) raid0(E) multipath(E)
> linear(E) md_mod(E) evdev(E) input_leds(E) hid_generic(E) usbhid(E) hid(E)
> crc32c_intel(E) xhci_pci(E) xhci_hcd(E) sd_mod(E) usbcore(E) t10_pi(E)
> gpio_amdpt(E) gpio_generic(E)
> kernel: [ 1166.739613] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G           
> E     5.7.2 #2
> kernel: [ 1166.739614] Hardware name: System manufacturer System Product
> Name/TUF B450-PLUS GAMING, BIOS 2008 12/06/2019
> kernel: [ 1166.739618] RIP: 0010:switch_mm_irqs_off+0x3ef/0x4a0
> kernel: [ 1166.739621] Code: 00 00 0f 84 be fc ff ff b9 49 00 00 00 b8 01 00
> 00 00 31 d2 0f 30 48 8b 86 68 03 00 00 65 48 89 05 fe 9e fb 7e e9 9c fc ff
> ff <0f> 0b f0 4c 0f ab ad 00 04 00 00 e9 3b fe ff ff b9 49 00 00 00 b8

I had to run this through ./scripts/decodecode, to obtain:

All code
========
   0:	00 00                	add    %al,(%rax)
   2:	0f 84 be fc ff ff    	je     0xfffffffffffffcc6
   8:	b9 49 00 00 00       	mov    $0x49,%ecx
   d:	b8 01 00 00 00       	mov    $0x1,%eax
  12:	31 d2                	xor    %edx,%edx
  14:	0f 30                	wrmsr  
  16:	48 8b 86 68 03 00 00 	mov    0x368(%rsi),%rax
  1d:	65 48 89 05 fe 9e fb 	mov    %rax,%gs:0x7efb9efe(%rip)        # 0x7efb9f23
  24:	7e 
  25:	e9 9c fc ff ff       	jmpq   0xfffffffffffffcc6
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	f0 4c 0f ab ad 00 04 	lock bts %r13,0x400(%rbp)
  33:	00 00 
  35:	e9 3b fe ff ff       	jmpq   0xfffffffffffffe75
  3a:	b9 49 00 00 00       	mov    $0x49,%ecx
  3f:	b8                   	.byte 0xb8

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	f0 4c 0f ab ad 00 04 	lock bts %r13,0x400(%rbp)
   9:	00 00 
   b:	e9 3b fe ff ff       	jmpq   0xfffffffffffffe4b
  10:	b9 49 00 00 00       	mov    $0x49,%ecx
  15:	b8                   	.byte 0xb8

And then pattern match that against my local defconfig build of
arch/x86/mm/tlb.o and pray my compiler did anyting like your
(unspecified) compiler.

Going by that, I ended up with:

		/*
		 * Even in lazy TLB mode, the CPU should stay set in the
		 * mm_cpumask. The TLB shootdown code can figure out from
		 * from cpu_tlbstate.is_lazy whether or not to send an IPI.
		 */
		if (WARN_ON_ONCE(real_prev != &init_mm &&
				 !cpumask_test_cpu(cpu, mm_cpumask(next))))
			cpumask_set_cpu(cpu, mm_cpumask(next));

being the WARN that went splat. I've Cc'ed both Rik and Andy who are
responsible for those lines. But this is all very little to go on.

> kernel: [ 1166.739623] RSP: 0018:ffffc90000147e48 EFLAGS: 00210086
> kernel: [ 1166.739624] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> 0000000000000000
> kernel: [ 1166.739626] RDX: ffff88871e43c001 RSI: ffff88871e43c0c0 RDI:
> ffff8886fdede440
> kernel: [ 1166.739627] RBP: ffff8886fdede440 R08: 0000000000000001 R09:
> ffff8887f3b5d140
> kernel: [ 1166.739628] R10: 000000000000025f R11: 0000000000000018 R12:
> ffff8886fdede440
> kernel: [ 1166.739629] R13: 0000000000000004 R14: ffff88871e43c0c0 R15:
> 0000000000000000
> kernel: [ 1166.739631] FS:  0000000000000000(0000) GS:ffff8887fe900000(0000)
> knlGS:0000000000000000
> kernel: [ 1166.739632] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> kernel: [ 1166.739633] CR2: 00003d9b09447000 CR3: 000000071e4e0000 CR4:
> 00000000001406e0
> kernel: [ 1166.739635] Call Trace:
> kernel: [ 1166.739643]  __schedule+0x1cd/0x5e0
> kernel: [ 1166.739647]  schedule_idle+0x28/0x40
> kernel: [ 1166.739651]  do_idle+0x149/0x1f0
> kernel: [ 1166.739655]  cpu_startup_entry+0x19/0x20
> kernel: [ 1166.739659]  start_secondary+0x170/0x1c0
> kernel: [ 1166.739663]  secondary_startup_64+0xa4/0xb0
> kernel: [ 1166.739667] ---[ end trace 8c88a1b8567786da ]---
> 
> --
> 
>  - Ilkka Prusi
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ