linux-kernel - Re: BUG: scheduling while atomic in acpi_ps_complete

Message-ID: <19f34abd0908240750q454a684fy9aaf4a1cb596c4ce@mail.gmail.com>
Date:	Mon, 24 Aug 2009 16:50:08 +0200
From:	Vegard Nossum <vegard.nossum@...il.com>
To:	Eric Paris <eparis@...hat.com>
Cc:	Alexey Starikovskiy <astarikovskiy@...e.de>,
	linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org,
	len.brown@...el.com
Subject: Re: BUG: scheduling while atomic in acpi_ps_complete_op

2009/8/24 Eric Paris <eparis@...hat.com>:
> On Sat, 2009-08-22 at 01:24 +0400, Alexey Starikovskiy wrote:
>> Eric Paris wrote:
>> > On Sat, 2009-08-22 at 00:12 +0400, Alexey Starikovskiy wrote:
>> >> Hi,
>> >> This should be handled by abe1dfab60e1839d115930286cb421f5a5b193f3.
>> >
>> > And yet I'm getting it from linux-next today.
>> >
>> > So you are apparently failing the in_atomic_preempt_off() test but
>> > succeeding in your !irqs_disabled() test.
>> >
>> > Something isn't right since I'm hitting it hundreds of times on boot.
>> >
>> > -Eric
>> >
>> Ok, let's see if replacing irqs_disabled() with
>> in_atomic_preempt_off() helps...
>
> It does stop my slew of warnings.  Not sure it completely fixes my
> problems though....
>
> [    1.897021] ... counter mask:            0000000700000003
> [    1.906821] ACPI: Core revision 20090625
> [   10.000008] INFO: RCU detected CPU 0 stall (t=10000 jiffies)
> [   10.000008] sending NMI to all CPUs:
> [   21.907580] Setting APIC routing to flat
> [   21.973314] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> [   21.985260] CPU0: Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz stepping 07
> [   21.992017] kmemcheck: Limiting number of CPUs to 1.
> [   21.993065] kmemcheck: Initialized
> [   22.750118] Brought up 1 CPUs
> [   22.751069] Total of 1 processors activated (5333.45 BogoMIPS).
> [   23.493639] khelper used greatest stack depth: 4848 bytes left
> [   24.999193] Booting paravirtualized kernel on bare hardware
> [   25.265364] Time: 17:50:52  Date: 08/21/09
> [   25.616191] NET: Registered protocol family 16
> [   27.765113] ACPI: bus type pci registered
> [   28.795307] PCI: Using configuration type 1 for base access
> [   61.793279] bio: create slab <bio-0> at 0
> [   95.285367] ACPI: BIOS _OSI(Linux) query ignored
> [  102.628227] ACPI: Interpreter enabled
> [  102.630134] ACPI: (supports S0 S1 S5)
> [  102.823225] ACPI: Using IOAPIC for interrupt routing
> [  142.365090] ACPI: No dock devices found.
> [  156.864036] ACPI: PCI Root Bridge [PCI0] (0000:00)
> [  157.460654] pci 0000:00:07.3: quirk: region 1000-103f claimed by PIIX4 ACPI
> [  157.463937] pci 0000:00:07.3: quirk: region 1040-104f claimed by PIIX4 SMB
> [  157.644036] pci 0000:00:11.0: transparent bridge
> [  193.009036] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 11 14 15) *0, disabled.
> [  193.938036] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *9 10 11 14 15)
> [  194.864036] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11 14 15)
> [  195.780036] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 11 14 15) *0, disabled.
>
> Something took 20 seconds between "ACPI: Core revision 20090625" and
> "Setting APIC routing to flat"
>
> This is a linux-next kernel, on vmware-server, with kmemcheck enabled.
> Disabling kmemcheck seems to make all of this go away.  If not the ACPI
> guys who should I be talking to?
>
> A little bit later I finally see backtraces from NMIs because of RCU
> stalls.  Anyone have ideas here?
>
> [  213.168161] INFO: RCU detected CPU 0 stall (t=10004 jiffies)

So this is probably just the intrinsic slowness of kmemcheck that
causes the big delays and RCU stalls. It shouldn't cause any other
badness; as far as I understand, the 10000 jiffies limit is just a
heuristic. Maybe we need to adjust it when kmemcheck is enabled.

I'm more confused about the change you had to make with
irqs_disabled()/in_atomic_preempt_off().
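
For reference, here is a rough userspace sketch of how the two guards can
disagree, which is the situation Eric described above (the atomic test
fails while the !irqs_disabled() test passes). It is only a toy model: the
struct, the model_* helpers and the may_sched_* functions are made up for
illustration and are not the kernel's definitions, which work on the real
preempt count and CPU flags.

```c
#include <stdbool.h>

/* Toy model of per-CPU state; hypothetical, not a kernel structure. */
struct cpu_ctx {
	unsigned int preempt_count;	/* non-zero while preemption is disabled */
	bool irqs_off;			/* true while local interrupts are masked */
};

/* Stand-in for irqs_disabled(): looks only at the interrupt state. */
static bool model_irqs_disabled(const struct cpu_ctx *c)
{
	return c->irqs_off;
}

/* Stand-in for an in_atomic-style test: looks only at the preempt count. */
static bool model_in_atomic(const struct cpu_ctx *c)
{
	return c->preempt_count != 0;
}

/* Guard that allows rescheduling whenever interrupts are enabled. */
static bool may_sched_irq_check(const struct cpu_ctx *c)
{
	return !model_irqs_disabled(c);
}

/* Guard that allows rescheduling only when not in atomic context. */
static bool may_sched_atomic_check(const struct cpu_ctx *c)
{
	return !model_in_atomic(c);
}
```

With preempt_count = 1 and interrupts enabled (for example, a spinlock
taken without disabling interrupts), may_sched_irq_check() says yes while
may_sched_atomic_check() says no, so a guard based only on the interrupt
state would still try to schedule in atomic context.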


Vegard