[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200708202102.58508.david-b@pacbell.net>
Date: Mon, 20 Aug 2007 21:02:58 -0700
From: David Brownell <david-b@...bell.net>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Michal Piotrowski <michal.k.k.piotrowski@...il.com>,
linux-usb-devel@...ts.sourceforge.net, Greg KH <gregkh@...e.de>,
LKML <linux-kernel@...r.kernel.org>,
"Stuart_Hayes@...l.com" <Stuart_Hayes@...l.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Daniel Exner <dex@...gonslave.de>
Subject: Re: [linux-usb-devel] [4/4] 2.6.23-rc3: known regressions
On Monday 20 August 2007, Linus Torvalds wrote:
>
> On Mon, 20 Aug 2007, David Brownell wrote:
>
> > On Monday 13 August 2007, Michal Piotrowski wrote:
> > > Subject     : EHCI Regression in 2.6.23-rc2
> > > References    : http://lkml.org/lkml/2007/8/10/81
> > > Last known good : ?
> > > Submitter    : Daniel Exner <dex@...gonslave.de>
> > > Caused-By    : Stuart_Hayes@...l.com <Stuart_Hayes@...l.com>
> > > Â Â Â Â Â Â Â Â Â commit 196705c9bbc03540429b0f7cf9ee35c2f928a534
> > > Handled-By    : ?
> > > Status      : unknown
> >
> > Fixed I believe by Stuart's patch:
> >
> > http://marc.info/?l=linux-usb-devel&m=118765934722610&w=2
>
> Quite frankly, I'd personally prefer to just revert commit
> 196705c9bbc03540429b0f7cf9ee35c2f928a534 entirely instead.
>
> The whole dependency on cpufreq seems totally bogus. Would it not be a lot
> more natural to handle the *result* of the problem (ie the MMF errors by
> broken EHCI controllers?) rather than add totally insane workarounds for
> this case to try to hide them in the first place?
MMF basically means the "Transaction Translating" (TT) hub had data
for the host, but the host didn't collect it in time ... so that some
data was lost.
Unfortunately, that's the type of fault that's especially hard to
recover from. Plus, very few of the upper layer drivers have even
a minor clue about fault recovery strategies. And I don't trust
the current hcd/usbcore code that tries to clean up after MMF.
On the plus side, MMF errors have been vanishingly rare until this
cpufreq interaction came up ... which of course implies the downside
that those "handle the result" code paths are all but untested.
> There can be *other* delays in reading memory that have nothing to do with
> cpu frequency shifting, and everything to do with exteme situations on the
> bus. If the stupid EHCI controller has some tight latency issues, that's a
> generic problem.
There could be such problems, yes. But in practice, I don't know
that we've ever seen them. (There's a first time for everthing,
yes. I *just* fetched a webpage where an image got overwritten
about
> That commit 196705c9bbc03540429b0f7cf9ee35c2f928a534 just exemplifies what
> is wrong with USB, but it does so by adding incredibly ugly code. I'd
> rather not add even *more* ugly code - especially not for a case where we
> then seem to blame the wrong party (ie a VIA controller that didn't need
> the ugly code in the first place).
>
> Serverworks/Broadcom makes totally crap chips (not just in USB) and then
> doesn't even document their buggy crap hardware. But that is NOT a reason
> for then making the kernel have buggy crap software in it.
>
> So really - is there any reason why we just don't say "Broadcom chips
> suck, and get MMF errors under normal circumstances because they are
> crap". And from *that*, the obvious solution would seem to not be to
> penalize everybody else, but to just say that "We will try to recover from
> MMF errors gracefully by retrying the transaction". Hmm?
>
> Linus
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists