Message-ID: <20170318144211.GA1014@lerouge>
Date:   Sat, 18 Mar 2017 15:42:15 +0100
From:   Frederic Weisbecker <fweisbec@...il.com>
To:     Pavel Machek <pavel@....cz>
Cc:     Alan Stern <stern@...land.harvard.edu>,
        torvalds@...ux-foundation.org,
        kernel list <linux-kernel@...r.kernel.org>,
        linux-usb@...r.kernel.org, gregkh@...uxfoundation.org,
        bhelgaas@...gle.com, linux-pci@...r.kernel.org
Subject: Re: v4.10-rc8 (-rc6) boot regression on Intel desktop, does not boot
 after cold boots, boots after reboot

On Thu, Feb 23, 2017 at 07:40:13PM +0100, Pavel Machek wrote:
> On Thu 2017-02-23 17:28:26, Frederic Weisbecker wrote:
> > On Tue, Feb 14, 2017 at 08:27:43PM +0100, Pavel Machek wrote:
> > > On Tue 2017-02-14 18:59:56, Pavel Machek wrote:
> > > > Hi!
> > > > 
> > > > > > > > Hmm. I moved keyboard between USB ports, and now 4.10-rc6 no longer
> > > > > > > > boots. v4.6 works ok. Let me try with keyboard unplugged... no, I
> > > > > > > > could not get it to work. I believe v4.9 and some v4.10-rc's worked,
> > > > > > > > but I'll have to double check.
> > > > > > > 
> > > > > > > But all the kernel versions worked when the keyboard was plugged into
> > > > > > > its original USB port?
> > > > > > 
> > > > > > Aha. So it looks like the difference is probably not in "where is the
> > > > > > keyboard plugged in" but in "reboot" vs. "cold boot". I did not do a
> > > > > > cold boot in quite a while :-(.
> > > > > > 
> > > > > > Booting to grub, then hitting ctrl-alt-del is enough to make it work. Ouch.
> > > > > > 
> > > > > > It happens with current Linus' tree.
> > > > > 
> > > > > v4.10-rc6-feb3 : broken
> > > > > v4.9 : ok
> > > > > (v4.6 : ok)
> > > > 
> > > > Hmm. It hangs during PCI fixups, and it hangs in v4.10-rc8, too.   
> > > > 
> > > > With debug patch below, I get
> > > > 
> > > > ...1d.7: PCI fixup... pass 2
> > > > ...1d.7: PCI fixup... pass 3
> > > > ...1d.7: PCI fixup... pass 3 done
> > > > 
> > > > ...followed by hang. So yes, it looks USB related.
> > > > 
> > > > (Sometimes it hangs with some kind of backtrace involving secondary CPU
> > > > startup; unfortunately the useful info is off screen at that point).
> > > 
> > > Forgot to say, 1d.7 is EHCI controller.
> > > 
> > > 00:1d.7 USB controller: Intel Corporation NM10/ICH7 Family USB2 EHCI
> > > Controller (rev 01)
> > 
> > Ok, I should have access soon to an EeePC 1015CX (which seems to have this controller).
> > I hope I'll be able to reproduce the issue there. If not, I'm sorry but I'll have to
> > burden you again :-)
> 
> Go through more mails. It is only reproducible after cold boot. .. so
> I doubt it will be easy to reproduce on another machine.
> 
> Now... I do have serial port, and I even might have serial cable
> somewhere, but.... Given how sensitive it is, it is probably going to
> go away with console on ttyS...

So I had access to a machine with an NM10/ICH7 chipset and I failed to reproduce
the issue there. Which machine are you using?

I fear you're my last resort. I suspect something is programming the clockevent
behind the tick. I thought it could be the clockevents switch code but I can't find
any issue there.

I see you have CONFIG_HIGH_RES_TIMERS=n. Could you try with it enabled?

For a quick rewind:

    git reset --hard v4.10
    git revert 558e8e27e73f53f8a512485be538b07115fe5f3c

Thanks!
