linux-kernel - Re: OHCI root_port_reset() deadly loop...

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Sun, 07 Oct 2007 00:31:41 -0700
From:	David Brownell <david-b@...bell.net>
To:	davem@...emloft.net
Cc:	linux-usb-users@...ts.sourceforge.net,
	linux-kernel@...r.kernel.org, greg@...ah.com
Subject: Re: OHCI root_port_reset() deadly loop...

> From davem@...emloft.net  Sat Oct  6 23:56:49 2007
>
> When root_port_reset() in ohci-hub.c polls for the end of the reset,
> it puts no limit on the loop and will only exit the loop when either
> the RH_PS_PRS bit clears or the register returns all 1's (the latter
> condition is a recent addition).

The all-ones typically indicating something like CardBus eject
not preceded by a "polite" driver shutdown.

> If for some reason the bit never clears, we sit here forever and never
> exit the loop.
>
> I actually hit this on one of my machines, and I'm trying to track
> down what's happening.

Sounds like a hardware issue -- unless something else is trashing
controller state.  We've not observed that specific hardware failure
before, for what it's worth.

Is this SPARC, or is ACPI potentially in play?  PCI, or non-PCI?
Are the other ports still behaving?  Is EHCI maybe trying to switch
ownership of that port?  Is maybe the (newish) autosuspend stuff
kicking in?

The OHCI spec requires the controller to stop the reset itself.
Most silicon seems to have a built-in 10 msec timeout.

> Regardless of why my machine is doing this, there absolutely should be
> some upper bound put on the number of times we will run through this
> loop, perhaps enough such that up to 5 seconds elapses waiting for the
> reset bit to clear.  And if it times out we should print a loud
> message onto the console, but still try to continue.

Patches accepted.  :)

Since the PRS bit is specified as "write one", with writing zero
as no-effect (since the rest is hardware-timed), the only recovery
procedure might involve resetting the whole controller.  Messy,
and not something the usb core has historically handled very well.

- Dave

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/