Message-ID: <20130508140342.GA8152@phenom.dumpdata.com>
Date:	Wed, 8 May 2013 10:03:42 -0400
From:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
To:	Borislav Petkov <bp@...en8.de>
Cc:	linux-kernel@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
	hpa@...or.com, x86@...nel.org, fenghua.yu@...el.com,
	xen-devel@...ts.xensource.com
Subject: Re: v3.9 - CPU hotplug and microcode earlier loading hits a mutex
 deadlock (x86_cpu_hotplug_driver_mutex)

On Wed, May 08, 2013 at 02:54:14PM +0200, Borislav Petkov wrote:
> On Tue, May 07, 2013 at 03:00:24PM -0400, Konrad Rzeszutek Wilk wrote:
> > I dug deeper in how QEMU does it and it looks to be actually doing
> > the right thing. It triggers the ACPI SCI, the method that figures
> > out the CPU online/offline bits kicks off the right OSPM notification
> > and everything is going through ACPI (so _STA on the processor is
> > checked, returns 0x2 (ACPI_STA_DEVICE_PRESENT), and the MADT now has
> > the CPU marked as enabled).
> 
> AFAIUC, you mean physical hotplug here which is done with ACPI, right?

Yes.

> And, if so, we actually need an x86 machine which supports that to test
> it on. Also, is this how physical hotplug is done? Put the number of
> max. supported CPUs in MADT and those which are not present are marked
> as disabled?
> 
> Then, when they're physically hotplugged, ACPI marks them as enabled?

Yes. The GPE is raised (by QEMU) and the ACPI method PRSC is invoked:

    Scope (_GPE)
    {
        Method (_L02, 0, NotSerialized)
        {
            Return (\_SB.PRSC ())
        }
    }
                                

PRSC iterates over the 32 bytes at I/O port 0xAF00 and checks each bit to
see whether the corresponding CPU is enabled (so the CPU is on) or
disabled. If the bit differs from the MADT.FLG entry (the 'flags' entry in
the MADT), it updates the MADT entry to one (or to zero if the CPU has been
disabled) and then Notifies the Processor. Here is what the Processor entry
looks like:

  Processor (PR02, 0x02, 0x0000B010, 0x06)                                
        {                                                                       
            Name (_HID, "ACPI0007")                                             
            OperationRegion (MATR, SystemMemory, Add (MAPA, 0x10), 0x08)        

[MAPA is the physical address of the MADT; the 0x10 offset increases by
eight bytes for each CPU]

            Field (MATR, ByteAcc, NoLock, Preserve)                             
            {                                                                   
                MAT,    64                                                      
            }                                                                   

[so it is 64 bits, the MAT is used in the '_STA' method to return the whole
contents of said memory location]
                                                                                
            Field (MATR, ByteAcc, NoLock, Preserve)                             
            {                                                                   
                        Offset (0x04),                                          
                FLG,    1                                                       
            }     
[and FLG is at offset 4 (out of 8 bytes), which means it lands on the
lapic->flags entry]                                                              
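
To make the offsets concrete, here is a rough C sketch of the eight-byte Processor Local APIC entry that the MAT and FLG fields overlay. The struct and helper names are made up for illustration (they are not the kernel's definitions), and it assumes that the base the offsets are added to is the first LAPIC entry, which is what the 0x10 offset for PR02 suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout of one MADT Processor Local APIC entry (8 bytes). */
struct madt_lapic_entry {
    uint8_t  type;          /* 0 = Processor Local APIC */
    uint8_t  length;        /* 8 */
    uint8_t  acpi_proc_id;  /* ties the entry to a Processor object */
    uint8_t  apic_id;
    uint32_t flags;         /* bit 0 = Enabled; this is what FLG maps onto */
};

/* FLG sits at Offset (0x04) inside the 64-bit MAT window. */
static_assert(offsetof(struct madt_lapic_entry, flags) == 4,
              "flags must land at byte offset 4");
static_assert(sizeof(struct madt_lapic_entry) == 8,
              "entry must be 8 bytes (MAT is 64 bits)");

#define MADT_LAPIC_ENABLED 0x1u

/* Hypothetical helper: offset of CPU n's entry from the first entry. */
static size_t lapic_entry_offset(unsigned cpu)
{
    return (size_t)cpu * sizeof(struct madt_lapic_entry);
}

/* Hypothetical helper: is the Enabled bit set in this entry? */
static int lapic_enabled(const struct madt_lapic_entry *e)
{
    return (e->flags & MADT_LAPIC_ENABLED) != 0;
}
```

With this layout, lapic_entry_offset(2) is 0x10, matching the Add (MAPA, 0x10) in the PR02 OperationRegion above.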
                                       
The PRSC method does what I mentioned above.
                                                                           
        OperationRegion (PRST, SystemIO, 0xAF00, 0x20)                          
        Field (PRST, ByteAcc, NoLock, Preserve)                                 
        {                                                                       
            PRS,    256
        }                                                                       
                                                                                
        Method (PRSC, 0, NotSerialized)                                         
        {                                                                       
            Store (ToBuffer (PRS), Local0)                                      
[Local0 has now the 32 bytes of data]

            Store (DerefOf (Index (Local0, Zero)), Local1)                      

[Local1 now holds the zeroth byte of the 32 bytes. Each bit is one CPU, so
it contains the state of eight CPUs]

            And (Local1, One, Local2)                                           

[Local2 = gpe_state.cpu_sts[i] & 1, aka first CPU]

            If (LNotEqual (Local2, ^PR00.FLG))                                  
            {                                                                   
                Store (Local2, ^PR00.FLG)                                       

[Write the bit in the PR00.FLG, so at offset four in the MADT]

                If (LEqual (Local2, One))
                {
[If it was disabled and is now enabled, notify with 1 (Device Check)]
                    Notify (PR00, One)
                    Subtract (MSU, One, MSU)
[fix up the checksum]
                }
                Else
                {
                    Notify (PR00, 0x03)
[if it was enabled and is now disabled, notify with 0x3 (Eject Request)]
                    Add (MSU, One, MSU)
[again, fix up the checksum]
                     
                }                                                               
            }                                                                   
                                                                                
            ShiftRight (Local1, One, Local1)              

[here it shifts and continues on testing each CPU bit]
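
For anyone who finds ASL hard to read, the same loop can be sketched in C. This is only a model of the logic described above, not code from QEMU or the kernel: struct fake_madt, struct notify_log and prsc_scan are hypothetical names, the FLG bits are kept in a plain array instead of the real MADT, and the Notify calls are just recorded in a log.

```c
#include <assert.h>
#include <stdint.h>

#define NCPUS 256                 /* 32 status bytes, one bit per CPU */
#define NOTIFY_DEVICE_CHECK  1    /* ACPI Notify value 1 */
#define NOTIFY_EJECT_REQUEST 3    /* ACPI Notify value 3 */

/* Stand-in for the pieces of the MADT that PRSC touches. */
struct fake_madt {
    uint8_t flg[NCPUS];   /* models each PRxx.FLG (0 or 1) */
    uint8_t msu;          /* models the MADT checksum byte (MSU) */
};

/* Records what Notify () would have been called with. */
struct notify_event { unsigned cpu; int value; };
struct notify_log {
    struct notify_event ev[NCPUS];
    unsigned count;
};

static void record_notify(struct notify_log *log, unsigned cpu, int value)
{
    assert(log->count < NCPUS);   /* the log is a fixed-size buffer */
    log->ev[log->count].cpu = cpu;
    log->ev[log->count].value = value;
    log->count++;
}

/* Walk all status bits; return how many CPUs changed state. */
static unsigned prsc_scan(const uint8_t prs[32], struct fake_madt *m,
                          struct notify_log *log)
{
    unsigned changed = 0;

    for (unsigned byte = 0; byte < 32; byte++) {
        uint8_t bits = prs[byte];                 /* Local1 */
        for (unsigned bit = 0; bit < 8; bit++) {
            unsigned cpu = byte * 8 + bit;
            uint8_t now = bits & 1;               /* Local2 */

            if (now != m->flg[cpu]) {
                m->flg[cpu] = now;                /* Store (Local2, FLG) */
                if (now) {
                    /* CPU came online: flags byte grew by one, so the
                     * checksum byte shrinks by one to keep the sum at 0. */
                    record_notify(log, cpu, NOTIFY_DEVICE_CHECK);
                    m->msu = (uint8_t)(m->msu - 1);
                } else {
                    record_notify(log, cpu, NOTIFY_EJECT_REQUEST);
                    m->msu = (uint8_t)(m->msu + 1);
                }
                changed++;
            }
            bits >>= 1;                           /* ShiftRight (Local1) */
        }
    }
    return changed;
}
```

The checksum adjustment works because an ACPI table's bytes must sum to zero modulo 256, so every change to the flags byte has to be cancelled by the opposite change to the checksum byte.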

> Questions over questions...?

I probably went overboard with my answers :-)
> 
> > I am now 99% sure you would be able to reproduce this on baremetal with
> > ACPI hotplug where the CPUs at bootup are marked as disabled in MADT.
> > (lapic->lapic_flags == 0).
> > 
> > The comment for calling save_mc_for_early says:
> 
> Looks like save_mc_for_early would need another, local mutex to fix that.

Let me try that. Thanks for the suggestion.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> Sent from a fat crate under my desk. Formatting is fine.
> --