Message-Id: <1434506234-610-1-git-send-email-rui.y.wang@intel.com>
Date:	Wed, 17 Jun 2015 09:57:14 +0800
From:	Rui Wang <rui.y.wang@...el.com>
To:	bp@...e.de, tony.luck@...el.com, gong.chen@...el.com
Cc:	linux-kernel@...r.kernel.org, Rui Wang <rui.y.wang@...el.com>
Subject: MCE bug?

Hi Boris & Tony,

While injecting MCEs using einj, I encountered a panic:

[    0.305697] mce: CPU supports 22 MCE banks
[    0.310288] BUG: unable to handle kernel NULL pointer dereference at 0000000000000100
[    0.319057] IP: [<ffffffff8107d0f2>] __queue_work+0x32/0x370
[    0.325398] PGD 0
[    0.327656] Oops: 0000 [#1] SMP

...

[    0.484045] Call Trace:
[    0.486780]  [<ffffffff8107d66b>] queue_work_on+0x2b/0x50
[    0.492821]  [<ffffffff8102e019>] mce_schedule_work.part.16+0x29/0x30
[    0.500020]  [<ffffffff8102f0d9>] machine_check_poll+0x249/0x260
[    0.506733]  [<ffffffff8102f123>] __mcheck_cpu_init_generic+0x33/0x100
[    0.514018]  [<ffffffff81030061>] mcheck_cpu_init+0x161/0x4b0
[    0.520443]  [<ffffffff81016095>] identify_cpu+0x365/0x450
[    0.526576]  [<ffffffff81b6144c>] identify_boot_cpu+0x10/0x7e
[    0.532994]  [<ffffffff81b614ee>] check_bugs+0x9/0x2d
[    0.538643]  [<ffffffff81b5b0a7>] start_kernel+0x469/0x495
[    0.544771]  [<ffffffff81b5aa2e>] ? set_init_arg+0x55/0x55
[    0.550900]  [<ffffffff81b5a120>] ? early_idt_handlers+0x120/0x120
[    0.557805]  [<ffffffff81b5a5ca>] x86_64_start_reservations+0x2a/0x2c
[    0.565001]  [<ffffffff81b5a709>] x86_64_start_kernel+0x13d/0x14c

This happened after the machine rebooted (due to an injected fatal error). During boot, machine_check_poll() looked for leftover errors in the MCE banks and then called mce_schedule_work(), but that seems to be too early: system_wq hasn't been allocated yet, hence the NULL pointer dereference.
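
To illustrate the ordering problem, here is a minimal userspace sketch, not kernel code and not a proposed patch. All names in it (fake_wq, sys_wq, pending_early, schedule_work_guarded, wq_init) are hypothetical stand-ins. It only models the idea that a request arriving before the queue exists could be remembered and replayed once the queue is set up, instead of dereferencing a NULL queue pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a workqueue; NOT the kernel's workqueue_struct. */
struct fake_wq {
    int queued;                 /* number of work items queued so far */
};

static struct fake_wq the_wq;   /* backing storage, used once "init" runs */
struct fake_wq *sys_wq;         /* NULL until wq_init(), like system_wq early in boot */
int pending_early;              /* set when work was requested before init */

/*
 * Model of a guarded scheduling call.
 * Returns 1 if the work was queued immediately, 0 if it was deferred
 * because the queue does not exist yet.
 */
int schedule_work_guarded(void)
{
    if (sys_wq == NULL) {
        pending_early = 1;      /* remember the request, do not touch the queue */
        return 0;
    }
    sys_wq->queued++;
    return 1;
}

/* Model of queue initialization: create the queue, then replay deferred work. */
void wq_init(void)
{
    sys_wq = &the_wq;
    assert(sys_wq != NULL);
    if (pending_early) {
        pending_early = 0;
        sys_wq->queued++;
    }
}
```

In this model, calling schedule_work_guarded() before wq_init() only sets the pending flag, and wq_init() later queues the deferred item. Whether the real fix should defer the work like this or move the leftover-bank polling later in boot is an open question.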

Is this a known problem? My kernel is based on Linux 4.1.0-rc3-7.

Thanks
Rui

