lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 19 Apr 2022 17:16:44 +0200
From:   Hans de Goede <hdegoede@...hat.com>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     "Rafael J . Wysocki" <rjw@...ysocki.net>,
        Borislav Petkov <bp@...en8.de>,
        "H . Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
        Mika Westerberg <mika.westerberg@...ux.intel.com>,
        Krzysztof Wilczyński <kw@...ux.com>,
        Myron Stowe <myron.stowe@...hat.com>,
        Juha-Pekka Heikkila <juhapekka.heikkila@...il.com>,
        Benoit Grégoire <benoitg@...us.ca>,
        Hui Wang <hui.wang@...onical.com>,
        Kai-Heng Feng <kai.heng.feng@...onical.com>,
        linux-acpi@...r.kernel.org, linux-pci@...r.kernel.org,
        x86@...nel.org, linux-kernel@...r.kernel.org,
        Bjorn Helgaas <bhelgaas@...gle.com>
Subject: Re: [PATCH v2 0/3] x86/PCI: Log E820 clipping

Hi,

On 4/19/22 17:03, Bjorn Helgaas wrote:
> On Tue, Apr 19, 2022 at 11:59:17AM +0200, Hans de Goede wrote:
>> On 1/1/70 01:00, Bjorn Helgaas wrote:
>>> This is still work-in-progress on the issue of PNP0A03 _CRS methods that
>>> are buggy or not interpreted correctly by Linux.
>>>
>>> The previous try at:
>>>   https://lore.kernel.org/r/20220304035110.988712-1-helgaas@kernel.org
>>> caused regressions on some Chromebooks:
>>>   https://lore.kernel.org/r/Yjyv03JsetIsTJxN@sirena.org.uk
>>>
>>> This v2 drops the commit that caused the Chromebook regression, so it also
>>> doesn't fix the issue we were *trying* to fix on Lenovo Yoga and Clevo
>>> Barebones.
>>>
>>> The point of this v2 update is to split the logging patch into (1) a pure
>>> logging addition and (2) the change to only clip PCI windows, which was
>>> previously hidden inside the logging patch and not well documented.
>>>
>>> Bjorn Helgaas (3):
>>>   x86/PCI: Eliminate remove_e820_regions() common subexpressions
>>>   x86: Log resource clipping for E820 regions
>>>   x86/PCI: Clip only host bridge windows for E820 regions
>>
>> Thanks, the entire series looks good to me:
>>
>> Reviewed-by: Hans de Goede <hdegoede@...hat.com>
> 
> Thank you!
> 
>> So what is the plan to actually fix the issue seen on some Lenovo models
>> and Clevo Barebones ?   As I mentioned previously I think that since all
>> our efforts have failed so far that we should maybe reconsider just
>> using DMI quirks to ignore the E820 reservation windows for host bridges
>> on affected models ?
> 
> I have been resisting DMI quirks but I'm afraid there's no other way.

Well there is the first match adjacent windows returned by _CRS and
only then do the "covers whole region" exception check. I still
think that would work at least for the chromebook regression...

So do you want me to give that a try; or shall I write a patch
using DMI quirks. And if we go the DMI quirks, what about
matching cmdline arguments?  If we add matching cmdline arguments,
which seems to be the sensible thing to do then to allow users
to test if they need the quirk, then we basically end up with my
first attempt at fixing this from 6 months ago:

https://lore.kernel.org/linux-pci/20211005150956.303707-1-hdegoede@redhat.com/

> I think the web we've gotten into, where vendors have used E820 to
> interact with _CRS in incompatible and undocumented ways, is not
> sustainable.
> 
> I'm not aware of any spec that says the OS should use E820 to clip
> things out of _CRS, so I think the long term plan should be to
> decouple them by default.

Right and AFAICT the reason Windows is getting away with this is
the same as with the original Dell _CRS has overlap with
physical RAM issue (1), Linux assigns address to unassigneds BAR-s
starting with the lowest available address in the bridge window,
where as Windows assigns addresses from the highest available
address in the window. So the real fix here might very well be
to rework the BAR assignment code to switch to fill the window
from the top rather then from the bottom. AFAICT all issues where
excluding _E820 reservations have helped are with _E820 - bridge
window overlaps at the bottom of the window.

IOW these are really all bugs in the _CRS method for the bridge,
which Windows does not hit because it never actually uses
the lowest address(es) of the _CRS returned window.

Regards,

Hans



1) At least I read in either a bugzilla, or email thread about
this that Windows allocating bridge window space from the top
was assumed to be why Windows was not impacted.





> Straw man:
> 
>   - Disable E820 clipping by default.
> 
>   - Add a quirk to enable E820 clipping for machines older than X,
>     e.g., 2023, to avoid breaking machines that currently work.
> 
>   - Add quirks to disable E820 clipping for individual machines like
>     the Lenovo and Clevos that predate X, but E820 clipping breaks
>     them.
> 
>   - Add quirks to enable E820 clipping for individual machines like
>     the Chromebooks (and probably machines we don't know about yet)
>     that have devices that consume part of _CRS but are not
>     enumerable.
> 
>   - Communicate this to OEMs to try to prevent future machines that
>     need quirks.
> 
> Bjorn
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ