Date:	Mon, 29 Oct 2007 17:52:18 -0600
From:	Robert Hancock <hancockr@...w.ca>
To:	Greg KH <greg@...ah.com>
Cc:	Jesse Barnes <jbarnes@...tuousgeek.org>, akpm@...ux-foundation.org,
	ak@...e.de, rajesh.shah@...el.com, torvalds@...ux-foundation.org,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

Greg KH wrote:
> On Fri, Oct 26, 2007 at 09:59:45AM -0700, Jesse Barnes wrote:
>> On Thursday, October 25, 2007 7:54 pm Greg KH wrote:
>>> On Thu, Oct 25, 2007 at 04:22:35PM -0700, Jesse Barnes wrote:
>>>> I think Greg doesn't like it, even though we don't have an
>>>> alternative at this point...
>>> Yes, I didn't like it, Ivan didn't like it, and I got reports that it
>>> wasn't even needed at all once you upgraded your BIOS to the latest
>>> version.
>>>
>>> So, is this still needed?  And if so, can you try to implement what
>>> Ivan suggested to do here instead?
>> Yes, it's still needed.  Auke rescinded his "BIOS upgrade makes it work" 
>> message, so something like this is still necessary.
> 
> He did?  Ugh, I can't keep these all straight, sorry.
> 
> Can someone just send what they think is still needed, and explain why
> Ivan will not object to it?  :)

Here's a recap of the whole issue just for people's information:

Right now we disable MMCONFIG on machines where the MCFG area is not 
reserved in the E820 memory map, on the assumption that it's not valid. 
This is a broken heuristic: the PCI Express firmware spec doesn't require 
the area to be reserved in E820, only as an ACPI motherboard resource, so 
it is often not reserved in E820 despite being completely valid and 
working. The 
mmconfig-validate-against-acpi-motherboard-resources.patch changes this 
to validate against the ACPI motherboard resources instead.
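
To make the difference between the two checks concrete, here is a rough 
userspace sketch. It is not the code from the patch; struct region, 
range_is_reserved() and the addresses in the usage note are all made up 
for illustration. The test is the same either way (is the MCFG range 
fully covered by one reserved entry?), and only the table it is run 
against changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reserved region, [start, end] inclusive. */
struct region {
	uint64_t start;
	uint64_t end;
};

/*
 * Return true if [start, end] lies entirely inside a single entry of
 * the given table.  The table could be built from E820 reserved
 * entries or from ACPI motherboard resources; this sketch does not
 * care which.
 */
static bool range_is_reserved(const struct region *tbl, size_t n,
			      uint64_t start, uint64_t end)
{
	assert(start <= end);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].start <= start && end <= tbl[i].end)
			return true;
	}
	return false;
}
```

With an invented MCFG window of 0xe0000000-0xefffffff that firmware lists 
only as an ACPI motherboard resource, the same call returns false against 
the E820 table and true against the ACPI table, which is the case the 
current heuristic gets wrong.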

The second problem is that on some machines, when we are doing BAR 
sizing on PCI devices and write all ones to a BAR in order to determine 
how many bits "stick", the BAR ends up overlapping with the MCFG area. 
On some chipsets, this causes writes to the MCFG area (such as the one 
that restores the original BAR contents) to get decoded by the device 
instead of by the MCFG mechanism, which means the BAR never gets its 
original value back and configuration access stops working, wreaking 
havoc. Usually on these machines the 
MMCONFIG is located near the top of 32-bit memory and the PCI device 
causing problems is a PCI Express graphics card. 
pci-disable-decode-of-io-memory-during-bar-sizing.patch, and its 
successors, switch off the device's decoding during sizing so that it 
won't absorb the accesses to the MCFG area.
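
The failure mode is easier to see in a toy model. The following is a 
userspace simulation only, not kernel code: struct sim_dev, the sim_*() 
helpers and the addresses in the usage note are invented, and the 
"chipset" is modelled as simply losing all config access once an enabled 
BAR overlaps the MCFG window. The two command bit values match the I/O 
and memory decode bits of the PCI command register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CMD_IO_DECODE	0x1u	/* I/O space decode enable bit */
#define CMD_MEM_DECODE	0x2u	/* memory space decode enable bit */

/* Hypothetical device with one 32-bit memory BAR (flag bits zero). */
struct sim_dev {
	uint16_t command;
	uint32_t bar;		/* current BAR contents */
	uint32_t size;		/* BAR size, a power of two */
	uint32_t mcfg_start;	/* MMCONFIG window, inclusive */
	uint32_t mcfg_end;
	bool cfg_broken;	/* device is swallowing MMCONFIG accesses */
};

/* After any config write, see whether the BAR now shadows MMCONFIG. */
static void sim_update(struct sim_dev *d)
{
	uint64_t bar_end = (uint64_t)d->bar + d->size - 1;

	if ((d->command & CMD_MEM_DECODE) &&
	    d->bar <= d->mcfg_end && bar_end >= d->mcfg_start)
		d->cfg_broken = true;
}

static void sim_write_bar(struct sim_dev *d, uint32_t val)
{
	if (d->cfg_broken)
		return;			/* write never reaches config space */
	d->bar = val & ~(d->size - 1);	/* only the address bits stick */
	sim_update(d);
}

static uint32_t sim_read_bar(const struct sim_dev *d)
{
	return d->cfg_broken ? 0xffffffffu : d->bar;
}

static void sim_write_cmd(struct sim_dev *d, uint16_t val)
{
	if (d->cfg_broken)
		return;
	d->command = val;
	sim_update(d);
}

/* Size the BAR, optionally with decode switched off around the probe. */
static uint32_t sim_bar_size(struct sim_dev *d, bool disable_decode)
{
	uint16_t orig_cmd = d->command;
	uint32_t orig, probe;

	assert(d->size && (d->size & (d->size - 1)) == 0);

	if (disable_decode)
		sim_write_cmd(d, (uint16_t)(orig_cmd &
			      ~(CMD_IO_DECODE | CMD_MEM_DECODE)));

	orig = sim_read_bar(d);
	sim_write_bar(d, 0xffffffffu);	/* see which bits stick */
	probe = sim_read_bar(d);
	sim_write_bar(d, orig);		/* restore; lost if cfg is broken */

	if (disable_decode)
		sim_write_cmd(d, orig_cmd);

	return (uint32_t)(~(probe & ~0xfu) + 1);
}
```

For a made-up 256MB BAR at 0xd0000000 with MMCONFIG at 
0xf0000000-0xf3ffffff, sizing with decode left on leaves the BAR stuck at 
0xf0000000 with cfg_broken set. Sizing with decode off reports 0x10000000 
and puts both the BAR and the command register back as they were.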

The concern raised was that this might affect some devices negatively. 
We do avoid disabling decode on host bridges, as it's known that some of 
them disable RAM access when you turn decode off, stupidly. I've yet to 
hear of any other conclusive case where disabling the decode is harmful. 
In general, if disabling the decode causes issues, the mere fact of 
doing the BAR sizing could cause the same issues, and that is unavoidable.

The other possible workaround would be to avoid using MMCONFIG until the 
BAR sizing is done. However, this seems like a poor solution. First of 
all, in the future there may come machines where MMCONFIG is the only 
config mechanism (or, perhaps more likely, it becomes the only tested 
one, so the old methods get broken). Secondly, what happens with 
hot-plug devices that need to be sized after MMCONFIG gets turned on?

The only way these two patches are related is that the E820 check 
happens to wrongly disable MMCONFIG on some of the machines where the 
memory areas could overlap during sizing, so removing that check 
without also fixing the overlap issue could cause breakage on those 
machines. However, that protection is purely accidental, and the check 
doesn't prevent the breakage on many other machines. As well as the one 
mentioned in the earlier thread, there's this one:

https://bugzilla.redhat.com/show_bug.cgi?id=251493

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@...pamshaw.ca
Home Page: http://www.roberthancock.com/