Message-ID: <19f34abd0806200611m27746adao40454f420dfef31b@mail.gmail.com>
Date:	Fri, 20 Jun 2008 15:11:04 +0200
From:	"Vegard Nossum" <vegard.nossum@...il.com>
To:	"Ingo Molnar" <mingo@...e.hu>
Cc:	linux-kernel@...r.kernel.org, "Len Brown" <lenb@...nel.org>,
	linux-acpi@...r.kernel.org, "Zhao Yakui" <yakui.zhao@...el.com>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	"Alexey Starikovskiy" <astarikovskiy@...e.de>,
	"Yinghai Lu" <yhlu.kernel@...il.com>
Subject: Re: [bug, acpi] BUG: spinlock bad magic on CPU#0, swapper/1, ACPI Exception (utmutex-0263): AE_BAD_PARAMETER

Hi,

On Fri, Jun 20, 2008 at 11:52 AM, Ingo Molnar <mingo@...e.hu> wrote:
>
> -tip auto-testing started triggering this spinlock corruption message
> yesterday:
>
> [    3.976213] calling  acpi_rtc_init+0x0/0xd3
> [    3.980213] ACPI Exception (utmutex-0263): AE_BAD_PARAMETER, Thread F7C50000 could not acquire Mutex [3] [20080321]

...

> i have found the AE_BAD_PARAMETER in older logs as well, but the spinlock
> corruption is new and nothing in that area is changed by -tip so i
> suspect it's a mainline problem as well.

It seems that some ACPI calls are made before ACPI is even
initialized, hence the AE_BAD_PARAMETER (ACPI is trying to use
uninitialized mutexes). I think that may be the source of the spinlock
corruption as well.

This probably happens because acpi_early_init(), which runs before all
the initcalls and is also where the mutexes get initialized, returns
early:

void __init acpi_early_init(void)
{
        acpi_status status = AE_OK;

        if (acpi_disabled)
                return;
...

I notice that you're booting with acpi=off, so it might be the same
problem. You could try this patch to find other callers that don't
check whether ACPI is available before using ACPI-defined mutexes:

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 235a138..5b34328 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -818,8 +818,7 @@ acpi_status acpi_os_wait_semaphore(acpi_handle handle, u32 u
        long jiffies;
        int ret = 0;

-       if (!sem || (units < 1))
-               return AE_BAD_PARAMETER;
+       BUG_ON(!sem || (units < 1));

        if (units > 1)
                return AE_SUPPORT;

(With this patch the kernel hits BUG_ON() and dumps the stack at the
offending call instead of printing AE_BAD_PARAMETER to dmesg. Since your
log already has at least three of these messages, it is guaranteed to
halt your machine!)
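
The suspected sequence can be sketched as a small userspace model. This
is not kernel code: every model_* name is made up for illustration, and
only the early return on acpi_disabled and the !sem / units check are
taken from the snippets above. The guarded caller shows the kind of
check I would expect the offending initcalls to be missing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; all model_* names are hypothetical. */
typedef enum { MODEL_AE_OK = 0, MODEL_AE_BAD_PARAMETER = 1 } model_status;

struct model_mutex { int initialized; };

static bool model_acpi_disabled;           /* stands in for acpi_disabled */
static struct model_mutex model_storage;
static struct model_mutex *model_handle;   /* stays NULL until early init */

/* Same shape as the quoted early return: with ACPI disabled, the
 * mutex is never set up and the handle stays NULL. */
static void model_early_init(void)
{
        if (model_acpi_disabled)
                return;
        model_storage.initialized = 1;
        model_handle = &model_storage;
}

/* Same parameter check as in acpi_os_wait_semaphore() before the patch. */
static model_status model_wait_semaphore(struct model_mutex *sem,
                                         unsigned int units)
{
        if (!sem || units < 1)
                return MODEL_AE_BAD_PARAMETER;
        return MODEL_AE_OK;
}

/* An initcall that uses the mutex without checking whether ACPI is up. */
static model_status model_unguarded_initcall(void)
{
        return model_wait_semaphore(model_handle, 1);
}

/* An initcall that bails out first. Returns 0 on success, -1 when it
 * skips because ACPI is disabled, -2 if the acquire itself fails. */
static int model_guarded_initcall(void)
{
        if (model_acpi_disabled)
                return -1;
        return model_wait_semaphore(model_handle, 1) == MODEL_AE_OK ? 0 : -2;
}
```

With model_acpi_disabled set before model_early_init(), the unguarded
caller gets MODEL_AE_BAD_PARAMETER while the guarded one skips cleanly;
with it cleared, both succeed.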


Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036
