Message-ID: <20080428203824.GA14044@linux-os.sc.intel.com>
Date:	Mon, 28 Apr 2008 13:38:24 -0700
From:	Venki Pallipadi <venkatesh.pallipadi@...el.com>
To:	Justin Mattock <justinmattock@...il.com>
Cc:	Bob Copeland <me@...copeland.com>, Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Venkatesh Pallipadi <venkatesh.pallipadi@...el.com>,
	hugh@...itas.com
Subject: Re: spinlock lockup on CPU#0

On Sat, Apr 26, 2008 at 09:48:55PM +0000, Justin Mattock wrote:
> On Sat, Apr 26, 2008 at 9:06 PM, Bob Copeland <me@...copeland.com> wrote:
> > On Sat, Apr 26, 2008 at 3:14 PM, Ingo Molnar <mingo@...e.hu> wrote:
> >  >  > Can you add this please, see if it triggers?
> >  >
> >  >  there's fixes pending in this area. The main fix would be the one below.
> >  >
> >  >         Ingo
> >  >
> >  >  ---------------->
> >  >  Subject: idle (arch, acpi and apm) and lockdep
> >
> >  FWIW, I was seeing the same lockdep trace with eventual hangs, and
> >  this patch (applied with some fuzz) fixed the problem.
> >
> >  --
> >  Bob Copeland %% www.bobcopeland.com
> >
> 
> Just out of curiosity I put the kernel back to its original state,
> where the freezing occurs, then booted with nohz=off and added
> WARN_ON(!irqs_disabled()); to sched.c, with no other patches applied
> to the kernel. Upon rebooting
> I received different results: the screen, from what I could tell, was
> spitting out the spinlock messages, but instead of printing them once
> and going on to the next task it just kept printing. From what I
> could tell it was something with ehci, uhci, agpgart, ieee1394 etc., too
> fast to really make anything out. The numbers on the left side kept
> moving upward and the fans started hauling ass. I waited a few minutes
> hoping this would stop
> so I could grab dmesg, but it wouldn't. Is there a way to use a boot
> param to write data to a file, so I could capture this event?
> regards
> 

OK. Hunted this bug down to
commit 3b22ec7b13cb31e0d87fbc0aabe14caaaad309e8

which for some reason enables interrupts in mwait_idle_with_hints(). That
eventually leaves interrupts enabled in the ACPI idle call, so
sched_clock_idle_wakeup_event() gets called with interrupts enabled. This bug
was only in the 32-bit x86 version.
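
To illustrate the failure mode, here is a rough user-space model, not the
kernel code. Every name in it (irqs_on, model_mwait_idle, model_wakeup_event,
model_acpi_idle) is made up for this sketch, and the real functions do far
more. It only shows how an idle helper that turns interrupts back on breaks a
caller that expects them to stay off until the wakeup hook has run.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the CPU interrupt flag; not a kernel API. */
static bool irqs_on = true;

/* Counts how many times the wakeup hook saw interrupts enabled. */
static int warnings;

static void model_local_irq_disable(void) { irqs_on = false; }
static void model_local_irq_enable(void)  { irqs_on = true; }

/*
 * Models the idle helper. In the "buggy" variant it re-enables
 * interrupts before returning, which the caller does not expect.
 */
static void model_mwait_idle(bool buggy)
{
    if (buggy)
        model_local_irq_enable();
}

/*
 * Models the wakeup hook. It expects interrupts to be off, in the
 * spirit of a WARN_ON(!irqs_disabled()) check.
 */
static void model_wakeup_event(void)
{
    if (irqs_on) {
        warnings++;
        fprintf(stderr, "WARN: wakeup event with interrupts enabled\n");
    }
}

/*
 * Models the idle path: disable interrupts, idle, run the wakeup
 * hook, then enable interrupts. Returns the warnings raised by
 * this one call.
 */
static int model_acpi_idle(bool buggy)
{
    int before = warnings;

    model_local_irq_disable();
    model_mwait_idle(buggy);
    model_wakeup_event();
    model_local_irq_enable();

    return warnings - before;
}
```

In this model, model_acpi_idle(false) raises no warning, while
model_acpi_idle(true) raises one, because the helper has already switched
interrupts back on by the time the wakeup hook runs.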

Peter's patch below, which is already in git, fixes this, so we don't need any
additional fixes here.

Thanks,
Venki

