lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <480F3CBC.60305@rtr.ca>
Date:	Wed, 23 Apr 2008 09:42:20 -0400
From:	Mark Lord <lkml@....ca>
To:	Jens Axboe <jens.axboe@...cle.com>
Cc:	linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
	npiggin@...e.de, torvalds@...ux-foundation.org
Subject: Re: [PATCH 1/11] Add generic helpers for arch IPI function calls

>>> Date:	Thu, 15 Nov 2007 12:07:48 -0500
>>> From: Mark Lord <lkml@....ca>
>>> To: Greg KH <gregkh@...e.de>
>>> Cc: Yasunori Goto <y-goto@...fujitsu.com>,
>>> 	Andrew Morton <akpm@...ux-foundation.org>,
>>> 	Alexey Dobriyan <adobriyan@...ru>, linux-kernel@...r.kernel.org
>>> Subject: Re: EIP is at device_shutdown+0x32/0x60
>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>> Content-Transfer-Encoding: 7bit
>>> Sender: linux-kernel-owner@...r.kernel.org
>>>
>>> ... < snip > ...
>>>
>>> Greg, I don't know if this is relevant or not,
>>> but x86 has bugs in the halt/reboot code for SMP.
>>>
>>> Specifically, in native_smp_send_stop() the code now uses
>>> spin_trylock() to "lock" the shared call buffers,
>>> but then ignores the result.
>>>
>>> This means that multiple CPUs can/will clobber each other
>>> in that code.
>>>
>>> The second bug, is that this code does not wait for the
>>> target CPUs to actually stop before it continues.
>>>
>>> This was the real cause of the failure-to-poweroff problems
>>> I was having with 2.6.23, which we fixed by using CPU hotplug
>>> to disable_nonboot_cpus() before the above code ever got run.

Jens Axboe wrote:
> On Tue, Apr 22 2008, Mark Lord wrote:
>> Jens,
>>
>> While you're in there, :)
>>
>> Could you perhaps fix this bug (above) if it still exists?
> 
> I don't understand the bug - what are the shared call buffers you are
> talking of?
> 
> With the changes, there's not even an spin_trylock() in there anymore.
> But I don't see the original bug either, so...
..

arch/x86/kernel/smp.c:

static void native_smp_send_stop(void)
{
        int nolock;
        unsigned long flags;

        if (reboot_force)
                return;

        /* Don't deadlock on the call lock in panic */
        nolock = !spin_trylock(&call_lock);       <<<<<<<<<< buggy
        local_irq_save(flags);
        __smp_call_function(stop_this_cpu, NULL, 0, 0);
        if (!nolock)
                spin_unlock(&call_lock);
        disable_local_APIC();
        local_irq_restore(flags);
}

The spinlock is trying to protect access to the global variable
"call_data" (higher up in the same file), which is used
in __smp_call_function() and friends.

But since the spinlock is ignored in this case,
the global "call_data will get clobbered if it was already in-use.

The second bug, is that for the halt case at least,
nobody waits for the other CPU to actually halt
before continuing.. so we sometimes enter the shutdown
code while other CPUs are still active.

This causes some machines to hang at shutdown,
unless CPU_HOTPLUG is configured and takes them offline
before we get here.

Cheers
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ