Message-ID: <5200BFB3.2050202@jp.fujitsu.com>
Date:	Tue, 06 Aug 2013 18:19:47 +0900
From:	HATAYAMA Daisuke <d.hatayama@...fujitsu.com>
To:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"kexec@...ts.infradead.org" <kexec@...ts.infradead.org>
CC:	Vivek Goyal <vgoyal@...hat.com>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Fenghua Yu <fenghua.yu@...el.com>,
	"H. Peter Anvin" <hpa@...or.com>, bhelgaas@...gle.com,
	jingbai.ma@...com
Subject: [Help Test] kdump, x86, acpi: Reproduce CPU0 SMI corruption issue
 after unsetting BSP flag

Hello,

I've been addressing the kdump restriction that only one cpu is
available in the kdump 2nd kernel. Now I need to check whether the CPU0
SMI corruption issue fixed in the following commit can be reproduced
again by unsetting the BSP flag of the boot cpu:

commit 74b5820808215f65b70b05a099d6d3c969b82689
Author: Bjorn Helgaas <bjorn.helgaas@...com>
Date:   Wed Jul 29 15:54:25 2009 -0600

    ACPI: bind workqueues to CPU 0 to avoid SMI corruption

    On some machines, a software-initiated SMI causes corruption unless the
    SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
    done in GPE-related methods that are run via workqueues, so we can avoid
    the known corruption cases by binding the workqueues to CPU 0.

    References:
        http://bugzilla.kernel.org/show_bug.cgi?id=13751
        https://bugs.launchpad.net/bugs/157171
        https://bugs.launchpad.net/bugs/157691

    Signed-off-by: Bjorn Helgaas <bjorn.helgaas@...com>
    Signed-off-by: Len Brown <len.brown@...el.com>

The reason is that I currently have two ideas for dealing with the
above kdump restriction:

  1) Disable BSP at the 2nd kernel, posted at:
    [PATCH v1 0/2] x86, apic: Disable BSP if boot cpu is AP
    https://lkml.org/lkml/2012/10/16/15

  2) Unset BSP flag at the 1st kernel, suggested by Eric Biederman
     during the discussion of the idea 1).

With idea 1), the BSP is disabled in the kdump 2nd kernel. My conclusion
is that we have no method to reset the BSP, i.e. to recover the BSP's
healthy state, while we can recover an AP by means of INIT as described
in the MP specification.

Idea 2) is simpler. We unset the BSP flag of the boot cpu in the 1st
kernel. The behaviour on receiving INIT depends on whether the BSP flag
is set in the cpu's MSR; we can set and unset the BSP flag in the MSR
freely at runtime. (I don't mean we should.)
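
To make the flag concrete, here is a minimal userspace sketch of the bit
arithmetic only. It assumes the MSR in question is IA32_APIC_BASE
(address 0x1B), where bit 8 is the BSP flag and bit 11 is the APIC global
enable bit, as documented in the Intel SDM. The helper names and the
sample value are hypothetical, and this is not the code of the attached
modules.

```python
# Assumption: the BSP flag lives in IA32_APIC_BASE (MSR 0x1B), bit 8.
IA32_APIC_BASE = 0x1B
BSP_FLAG = 1 << 8             # set on the bootstrap processor
APIC_GLOBAL_ENABLE = 1 << 11  # local APIC enabled


def is_bsp(apic_base: int) -> bool:
    """Return True if the BSP flag is set in an IA32_APIC_BASE value."""
    return bool(apic_base & BSP_FLAG)


def clear_bsp(apic_base: int) -> int:
    """Return the value with only the BSP flag cleared."""
    return apic_base & ~BSP_FLAG


# Hypothetical sample: default APIC base 0xFEE00000, APIC enabled, BSP set.
value = 0xFEE00000 | APIC_GLOBAL_ENABLE | BSP_FLAG
cleared = clear_bsp(value)
print(hex(value), is_bsp(value))
print(hex(cleared), is_bsp(cleared))
```

Only bit 8 changes; the APIC base address and the enable bit are left
untouched.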

So, the next thing I should do is to evaluate the risk of idea 2). In
fact, during the discussion of idea 1), HPA pointed out that some
firmware is affected if the BSP flag is unset. Also, perhaps for the
same reason, the recently introduced cpu0 hot-plugging feature by
Fenghua Yu doesn't appear to unset the BSP flag.

The biggest problem now is that I don't have any of the machines
reported in the bugzilla entries; this issue inherently depends on
firmware.

So, could anyone help test idea 2) above if you have any of the
following machines? (or other ones that can lead to the same bug)

- HP Compaq 6910p
- HP Compaq 6710b
- HP Compaq 6710s
- HP Compaq 6510b
- HP Compaq 2510p

I prepared some small programs for this test. See the attached file.
The steps to try to reproduce the bug are as follows:

  1. $ tar xf bsp_flag_modules.tar.gz; cd bsp_flag_modules
  2. $ make # to build these programs
  3. $ insmod unsetbspflag.ko # to unset BSP flag of the boot cpu
  4. $ insmod getcpuinfo.ko # to confirm if BSP flag of the boot cpu has
                            # been unset.
     $ dmesg | tail
  5. Close the lid of the machine.
  6. Wait some minutes if necessary.
  7. Open the lid; you should see an oops on the screen if the bug has
    been reproduced.
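
As an optional cross-check of step 4 from userspace, the flag can also
be read through the kernel's msr driver. This is a sketch under stated
assumptions, not a replacement for getcpuinfo.ko: it assumes the msr
driver is loaded (modprobe msr), that it runs as root, and that the BSP
flag is bit 8 of IA32_APIC_BASE (MSR 0x1B). The function names are
hypothetical.

```python
import glob
import os
import struct

IA32_APIC_BASE = 0x1B  # assumption: MSR holding the BSP flag
BSP_FLAG = 1 << 8


def decode_msr(raw: bytes) -> int:
    """Decode the 8 little-endian bytes returned by /dev/cpu/N/msr."""
    (value,) = struct.unpack("<Q", raw)
    return value


def read_msr(cpu: int, msr: int) -> int:
    """Read one MSR on one cpu; the file offset selects the MSR number."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return decode_msr(os.pread(fd, 8, msr))
    finally:
        os.close(fd)


def report_bsp_flags() -> None:
    """Print the BSP flag of every cpu that exposes an msr device node."""
    paths = glob.glob("/dev/cpu/[0-9]*/msr")
    for cpu in sorted(int(p.split("/")[3]) for p in paths):
        value = read_msr(cpu, IA32_APIC_BASE)
        flag = bool(value & BSP_FLAG)
        print(f"cpu{cpu}: IA32_APIC_BASE={value:#x} BSP={flag}")
```

Calling report_bsp_flags() after step 3 should show BSP=False on the
boot cpu if the flag has been unset.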

-- 
Thanks.
HATAYAMA, Daisuke

Download attachment "bsp_flag_modules.tar.gz" of type "application/gzip" (9181 bytes)
