lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.2.01.0910120725350.3438@localhost.localdomain>
Date:	Mon, 12 Oct 2009 08:24:43 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	David Woodhouse <david.woodhouse@...el.com>
cc:	"Rafael J. Wysocki" <rjw@...k.pl>,
	Greg Kroah-Hartman <gregkh@...e.de>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Adrian Bunk <bunk@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Natalie Protasevich <protasnb@...il.com>
Subject: Re: 2.6.32-rc4: Reported regressions from 2.6.31



On Mon, 12 Oct 2009, David Woodhouse wrote:
> 
> Well, according to the design, the IOMMU code is doing the right thingĀ¹.
> 
> The theory is that the BIOS _tells_

There is no "theory". There is only crap BIOSes. Stop living in a dream 
world, and stop making arguments that are only relevant in that dream 
world.

> The only viable solution (short of an open source BIOS written by sober
> people)

Again, you're living in that dream world. Wake up, sheeple.

BIOS writers write crap, because it's a crap job. It's that simple. Yes, 
they're probably drunk or drugged up, but they need it to deal with the 
hand they have been dealt.

There are absolutely _zero_ incentives to write good firmware for any 
particular device (and nothing else matters), because the projects are all 
(a) largely "throw-over-the-fence-and-forget-about-it" for any particular 
machine (ie any fixes are mostly relevant for the _next_ generation of 
machines), and (b) they have no actual user incentives or feedback, since 
nobody really "runs" the firmware anyway - it's supposed to be invisible 
by design and just sets things up.

So outside of the testing that it gets (_before_ it ever hits any users 
hands) there is never _ever_ going to be any QA.  Once it is in user 
hands, it is what it is. Almost nobody upgrades their firmware in 
practice. Yeah, geeks may do. Normal people? Not so much.

And that would _not_ change even with an open source BIOS. Why? Because 
the _code_ generally isn't the problem. The problem tends to be all the 
localization for any particular machine.

Even if the code was open source and perfect, we'd still have trouble with 
firmware - because 99% of it is driven by tables that by design have to be 
changed for each machine (yes, yes, a BIOS is easily a megabyte of 
compressed code too, but a lot of it is the nice GUI for setup etc - the 
actual code that runs on _bootup_ is rather small, and is all about those 
tables).

We might have a few _fewer_ problems if the code was perfect, and maybe 
we'd have missed this particular issue (but I doubt it, it's still a 
localized table). But it wouldn't really change any of the fundamentals in 
the issue.

BTW, in this case, it's not even necessarily the firmware people who are 
to blame. It's a combination of (a) USB is crazy polled DMA and (b) IOMMU 
is a whole new thing and (c) ACPI is too f*cking complex so nobody will 
_ever_ get it right anyway, even if the firmware people didn't have 
everything against them.

So your arguments about RMRR tables and "should do" are POINTLESS. Just 
give it up.

The sane thing to do is to have a legacy IOMMU mapping until all devices 
are initialized, so that things work _despite_ the inevitable BIOS 
crapola. End of story.

So stop blaming the BIOS. We _know_ firmware is crap - there is no point 
in blaming it. The response to "firmware bug" should be "oh, of course - 
and our code was too fragile, since it didn't take that into account".

And stop saying these problems would magically go away with open-source 
firmware. That just shows that you don't understand the realities of the 
situation. Even an open-source bios would end up having buggy tables, and 
even with an opensource bios, users generally wouldn't upgrade it.

"If wishes were horses, beggars would ride"

> was to quiesce the USB controllers before enabling the IOMMU.

The other solution would be to just _enable_ (and do all the setup) of the 
IOMMU early. And then just leave a legacy mapping for the IOMMU, and then 
_after_all_devices_are_set_up_ can you then remove the legacy mapping.

I don't much care. Just as long as the DMA works for the whole bus setup.

> The final PCI quirks are currently run from pci_init() which is a
> device_initcall(), which is too late -- in fact, it could actually be
> _after_ some of the real device drivers have taken control of the same
> hardware.

Now, this I can certainly agree 100% with. We can and should move the 
final quirks up a bit. And yes, I absolutely think we should at least 
_guarantee_ that those quirks are run before any other device_initcalls, 
regardless of any IOMMU issues (ie now it looks like it depends on link 
ordering whether a built-in driver might run before or after pci_init()).

So I agree that we may well be able to fix this by changing the ordering. 
What I disagree with is your continual "I wish things were different". 
I've seen you make that "open source firmware" argument before. It's 
pointless. Stop doing it.

But your patch looks fine.

That said, I think your choice of initcall is odd, even if it is 
understandable. Right now we have, at least on x86:

	fs_initcall(pcibios_assign_resources);

and I assume you picked fs_initcall_sync() so that it happens after that 
one. No?

Which makes sense, but at the same time, it all looks just random. And 
different architectures actually do it in different places (some seem to 
do it inside pcibios_init() at subsys_initcall time). So I'm not even sure 
fs_initcall_sync() will do it for other architectures, although it looks 
like most others do their final PCI setup _earlier_ rather than later.

I'm wondering if we should not get _rid_ of that whole pci_init() (you 
renamed it, and I agree - it's clearly not really "pci_init()"), and move 
it to just be the last thing that pci_subsys_init() does, or perhaps all 
the way into pcibios_resource_survey().

But I guess there could be random ACPI initcalls etc involved too, and 
subtle ordering constraints with _those_. And we have way too many 
arch-specific details here. So your patch may be the simplest one, but I 
wish we could also make some of this be less of a jungle of different 
initcalls.

Basically, it seems silly to have this kind of subtlety for just the final 
quirk, when the _other_ quirks are all handled by generic code in very 
well-defined places.

For example, one of the effects of this mess is that as far as I can tell, 
PCI hotplug (or cardbus, which is a special case of PCI hotplug) will 
never run the pci_fixup_late fixups at all, even though it _will_ run the 
pci_fixup_header/early/enable ones (because those are done by generic 
code: pci_setup_device(), pci_device_add() and pci_enable_device()).

			Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ