lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <a1fd0d9c6c8cd90a74879b61467ae48d@natalenko.name>
Date:   Sun, 15 Nov 2020 23:32:47 +0100
From:   Oleksandr Natalenko <oleksandr@...alenko.name>
To:     linux-rt-users@...r.kernel.org
Cc:     linux-kernel@...r.kernel.org, tglx@...utronix.de,
        peterz@...radead.org, rostedt@...dmis.org
Subject: WARNING at kernel/sched/core.c:2013 migration_cpu_stop+0x2e3/0x330

Hi.

I'm running v5.10-rc3-rt7 for some time, and I came across this splat in 
dmesg:

```
[118769.951010] ------------[ cut here ]------------
[118769.951013] WARNING: CPU: 19 PID: 146 at kernel/sched/core.c:2013 
migration_cpu_stop+0x2e3/0x330
[118769.951018] Modules linked in: uinput uas usb_storage 
blocklayoutdriver xt_mark ip6table_nat ip6table_filter ip6_tables rfcomm 
fuse rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace 
sunrpc nfs_ssc fscache iptable_nat xt_MASQUERADE nf_nat iptable_filter 
xt_comment nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 cmac 
algif_hash algif_skcipher nf_tables af_alg snd_hda_codec_realtek nct6775 
bnep tun nfnetlink hwmon_vid snd_hda_codec_generic ledtrig_audio 
snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg iwlmvm soundwire_intel 
soundwire_generic_allocation soundwire_cadence nls_iso8859_1 nls_cp437 
vfat edac_mce_amd snd_hda_codec fat eeepc_wmi snd_hda_core mac80211 
uvcvideo kvm_amd asus_wmi libarc4 soundwire_bus btusb videobuf2_vmalloc 
btrtl videobuf2_memops battery btbcm sparse_keymap wmi_bmof 
snd_usb_audio mxm_wmi videobuf2_v4l2 btintel snd_usbmidi_lib 
videobuf2_common snd_hwdep iwlwifi snd_soc_core snd_rawmidi bluetooth 
kvm videodev snd_seq_device snd_compress joydev
[118769.951047]  ecdh_generic ac97_bus irqbypass ecc snd_pcm_dmaengine 
input_leds mousedev mc crc16 r8169 rapl cfg80211 realtek sp5100_tco 
snd_pcm mdio_devres of_mdio k10temp i2c_piix4 snd_timer rfkill fixed_phy 
ipmi_devintf igb snd ipmi_msghandler libphy dca soundcore evdev tpm_crb 
mac_hid tpm_tis tpm_tis_core pinctrl_amd wmi acpi_cpufreq tcp_bbr 
vhost_vsock vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock 
msr crypto_user ip_tables x_tables xfs dm_thin_pool dm_persistent_data 
dm_bio_prison dm_bufio libcrc32c crc32c_generic dm_crypt cbc 
encrypted_keys trusted tpm hid_logitech_hidpp hid_logitech_dj 
hid_generic usbhid hid dm_mod raid10 md_mod crct10dif_pclmul 
crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd 
xhci_pci xhci_pci_renesas ccp cryptd ehci_pci glue_helper xhci_hcd 
ehci_hcd rng_core amdgpu gpu_sched ttm i2c_algo_bit drm_kms_helper 
syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm agpgart
[118769.951079] CPU: 19 PID: 146 Comm: migration/19 Not tainted 
5.10.0-pf0 #1
[118769.951080] Hardware name: System manufacturer System Product 
Name/Pro WS X570-ACE, BIOS 2311 10/16/2020
[118769.951081] Stopper: migration_cpu_stop+0x0/0x330 <- 
affine_move_task+0x42f/0x620
[118769.951083] RIP: 0010:migration_cpu_stop+0x2e3/0x330
[118769.951084] Code: ff ff 31 db 45 85 ed 0f 89 65 ff ff ff 8b b5 d0 0a 
00 00 4c 89 ff e8 cc 43 ff ff 0f b6 d8 66 85 db 75 d8 0f 0b e9 f2 fd ff 
ff <0f> 0b e9 eb fd ff ff 44 89 ee 4c 89 ff e8 ab 43 ff ff 84 c0 0f 84
[118769.951085] RSP: 0018:ffffb58c806c7e50 EFLAGS: 00010046
[118769.951086] RAX: ffffa136e7a9c300 RBX: 0000000000000000 RCX: 
0000000000000000
[118769.951086] RDX: 000000000000000d RSI: 0000000000000013 RDI: 
ffffa136e7a9bf80
[118769.951087] RBP: ffffa13d0eee99c0 R08: 000000000000002f R09: 
ffffa13d0ef29af0
[118769.951087] R10: 00000000000000ec R11: 000000000000016a R12: 
ffffb58c9118fdd0
[118769.951088] R13: 00000000ffffffff R14: ffffa136e7a9c820 R15: 
ffffa136e7a9bf80
[118769.951089] FS:  0000000000000000(0000) GS:ffffa13d0eec0000(0000) 
knlGS:0000000000000000
[118769.951089] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[118769.951090] CR2: 00007f687474a000 CR3: 000000021319c000 CR4: 
0000000000350ee0
[118769.951090] Call Trace:
[118769.951094]  ? set_cpus_allowed_ptr+0x10/0x10
[118769.951095]  cpu_stopper_thread+0x89/0x130
[118769.951097]  ? smpboot_register_percpu_thread+0xe0/0xe0
[118769.951099]  smpboot_thread_fn+0x1d8/0x2c0
[118769.951100]  kthread+0x190/0x1b0
[118769.951101]  ? __kthread_init_worker+0x50/0x50
[118769.951102]  ret_from_fork+0x22/0x30
[118769.951105] CPU: 19 PID: 146 Comm: migration/19 Not tainted 
5.10.0-pf0 #1
[118769.951106] Hardware name: System manufacturer System Product 
Name/Pro WS X570-ACE, BIOS 2311 10/16/2020
[118769.951106] Stopper: migration_cpu_stop+0x0/0x330 <- 
affine_move_task+0x42f/0x620
[118769.951107] Call Trace:
[118769.951108]  dump_stack+0x6d/0x88
[118769.951111]  __warn.cold+0x24/0x3d
[118769.951113]  ? migration_cpu_stop+0x2e3/0x330
[118769.951114]  report_bug+0xd1/0x100
[118769.951116]  handle_bug+0x3a/0xa0
[118769.951118]  exc_invalid_op+0x15/0xd0
[118769.951119]  asm_exc_invalid_op+0x12/0x20
[118769.951121] RIP: 0010:migration_cpu_stop+0x2e3/0x330
[118769.951122] Code: ff ff 31 db 45 85 ed 0f 89 65 ff ff ff 8b b5 d0 0a 
00 00 4c 89 ff e8 cc 43 ff ff 0f b6 d8 66 85 db 75 d8 0f 0b e9 f2 fd ff 
ff <0f> 0b e9 eb fd ff ff 44 89 ee 4c 89 ff e8 ab 43 ff ff 84 c0 0f 84
[118769.951122] RSP: 0018:ffffb58c806c7e50 EFLAGS: 00010046
[118769.951123] RAX: ffffa136e7a9c300 RBX: 0000000000000000 RCX: 
0000000000000000
[118769.951123] RDX: 000000000000000d RSI: 0000000000000013 RDI: 
ffffa136e7a9bf80
[118769.951124] RBP: ffffa13d0eee99c0 R08: 000000000000002f R09: 
ffffa13d0ef29af0
[118769.951124] R10: 00000000000000ec R11: 000000000000016a R12: 
ffffb58c9118fdd0
[118769.951124] R13: 00000000ffffffff R14: ffffa136e7a9c820 R15: 
ffffa136e7a9bf80
[118769.951126]  ? set_cpus_allowed_ptr+0x10/0x10
[118769.951127]  cpu_stopper_thread+0x89/0x130
[118769.951128]  ? smpboot_register_percpu_thread+0xe0/0xe0
[118769.951129]  smpboot_thread_fn+0x1d8/0x2c0
[118769.951130]  kthread+0x190/0x1b0
[118769.951130]  ? __kthread_init_worker+0x50/0x50
[118769.951131]  ret_from_fork+0x22/0x30
[118769.951133] ---[ end trace 0000000000000002 ]---
```

which corresponds to the following condition:

```
2007         /*
2008          * When this was migrate_enable() but we no longer have an
2009          * @pending, a concurrent SCA 'fixed' things and we should 
be
2010          * valid again. Nothing to do.
2011          */
2012         if (!pending) {
2013             WARN_ON_ONCE(!is_cpu_allowed(p, cpu_of(rq)));
2014             goto out;
2015         }
```

I'm not sure what triggered this, and the system still looks usable 
afterwards. I have no idea how to trigger it again ATM, so this is just 
a heads up in case you know what could go wrong.

Thanks.

-- 
   Oleksandr Natalenko (post-factum)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ