Message-Id: <1236613902.8389.675.camel@laptop>
Date:	Mon, 09 Mar 2009 16:51:42 +0100
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Frans Pop <elendil@...net.nl>
Cc:	linux-s390@...r.kernel.org,
	Hendrik Brueckner <brueckner@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...e.hu>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [BUG,2.6.29-rc7,s390] System goes into endless loop during
 boot or logon

On Mon, 2009-03-09 at 16:43 +0100, Frans Pop wrote:
> On Monday 09 March 2009, Peter Zijlstra wrote:
> > On Mon, 2009-03-09 at 02:53 +0100, Frans Pop wrote:
> > > Follow-up to an issue reported on the linux-s390 list, seen in the
> > > Hercules S/390 emulator.
> > >
> > > On Sunday 08 March 2009, Frans Pop wrote:
> > > > Well, not quite. It does boot successfully and I do get a login
> > > > prompt. I can also login on the console or connect with SSH, but in
> > > > both cases the system again gets into some loop before I actually
> > > > get a shell prompt.
> > >
> > > During the bisection series the system would sometimes enter the loop
> > > during the boot procedure, before I tried to logon. After it enters
> > > the loop one processor just goes racing at 100%.
> >
> > Where? Do you have NMI watchdog output, or even sysrq-t?
> 
> Hmmm. Your commit log message for ca109491f612aab5c8152207631c0444f63da97f 
> does explicitly mention the risk of an infinite loop, as does a comment 
> in hrtimer_enqueue_reprogram().
> 
> Any chance the cause is there? Any way to test for that?

a6037b61c2f5fc99c57c15b26d7cfa58bbb34008 should have fixed the mentioned
issue (along with the deadlock mentioned in the changelog).

The issue was that you could enqueue an expired timer, run it in place,
enqueue it again, and so on without ever leaving the enqueue path.

The current code would not run it in place, but instead fire a softirq
to handle it. That opens up a preemption window.

Note, this can only happen with HRTIMER_RESTART timers, and those should
be careful to avoid hogging the CPU anyway.
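
To make the difference concrete, here is a toy userspace model of the two
behaviours described above. It is only a sketch and not kernel code: ToyTimer,
ToyCpu, enqueue_run_in_place() and the max_runs cap are hypothetical names
made up for illustration, and the "softirq" is just a pending queue.

```python
from collections import deque

RESTART, NORESTART = "restart", "norestart"


class ToyTimer:
    """Hypothetical stand-in for a timer: a callback plus an expiry time."""

    def __init__(self, callback, expires):
        self.callback = callback
        self.expires = expires


def enqueue_run_in_place(timer, now, max_runs=5):
    """Model of the old behaviour: an already expired timer is run directly
    from the enqueue path, and a RESTART result re-enqueues it at once.

    Returns (runs, looped). max_runs is an artificial cap so that the demo
    terminates; without it a misbehaving callback would spin forever.
    """
    runs = 0
    while timer.expires <= now:
        if runs == max_runs:
            return runs, True
        runs += 1
        if timer.callback(timer, now) != RESTART:
            break
    return runs, False


class ToyCpu:
    """Model of the deferred behaviour: expired timers are queued for a
    later softirq-like pass instead of being run from enqueue()."""

    def __init__(self):
        self.pending = deque()
        self.log = []

    def enqueue(self, timer, now):
        if timer.expires <= now:
            # Do not run the callback here; defer it to run_softirq().
            self.pending.append(timer)

    def run_softirq(self, now):
        # Handle only what was pending on entry; re-enqueued timers wait
        # for the next pass, which leaves a window for other work.
        batch = list(self.pending)
        self.pending.clear()
        for timer in batch:
            self.log.append("timer")
            if timer.callback(timer, now) == RESTART:
                self.enqueue(timer, now)

    def run_other_work(self):
        self.log.append("other")


def stuck_callback(timer, now):
    """A badly behaved restart callback: never moves its expiry forward."""
    return RESTART


def polite_callback(timer, now):
    """A well behaved restart callback: pushes its expiry into the future."""
    timer.expires = now + 10
    return RESTART
```

With stuck_callback the in-place variant hits the cap and reports a loop,
while the deferred variant alternates between the timer and other work. The
timer still eats CPU time in the second case, which is why restart callbacks
need to be careful either way.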

Doesn't this s390 thing have a sysrq key you can press to get some
traces out?
