Date:   Wed, 2 May 2018 14:29:40 -0500
From:   "Alex G." <mr.nuke.me@...il.com>
To:     Pavel Machek <pavel@....cz>, Borislav Petkov <bp@...en8.de>
Cc:     linux-acpi@...r.kernel.org, linux-edac@...r.kernel.org,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Len Brown <lenb@...nel.org>, Tony Luck <tony.luck@...el.com>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        Robert Moore <robert.moore@...el.com>,
        Erik Schmauss <erik.schmauss@...el.com>,
        Tyler Baicar <tbaicar@...eaurora.org>,
        Will Deacon <will.deacon@....com>,
        James Morse <james.morse@....com>,
        Shiju Jose <shiju.jose@...wei.com>,
        "Jonathan (Zhixiong) Zhang" <zjzhang@...eaurora.org>,
        Dongjiu Geng <gengdongjiu@...wei.com>,
        linux-kernel@...r.kernel.org, devel@...ica.org
Subject: Re: [RFC PATCH v3 3/3] acpi: apei: Warn when GHES marks correctable
 errors as "fatal"

On 05/02/2018 02:10 PM, Pavel Machek wrote:
> On Thu 2018-04-26 13:20:57, Borislav Petkov wrote:
>> On Wed, Apr 25, 2018 at 03:39:51PM -0500, Alexandru Gagniuc wrote:
>>> There seems to be a culture amongst BIOS teams to want to crash the
>>> OS when an error can't be handled in firmware. Marking GHES errors as
>>> "fatal" is a very common way to do this.
>>>
>>> However, a number of errors reported by GHES may be fatal in the sense
>>> a device or link is lost, but are not fatal to the system. When there
>>> is a disagreement with firmware about the handleability of an error,
>>> print a warning message.
> 
> 
>>> +
>>> +	if ((sev >= GHES_SEV_PANIC) && (ghes_actual_severity(ghes) < sev)) {
>>> +		pr_warn("FIRMWARE BUG: Firmware sent fatal error that we were able to correct");
>>> +		pr_warn("BROKEN FIRMWARE: Complain to your hardware vendor");
>>
>> Pasting the same comment from last time since you missed it:
>>
>> "No, I don't want any of that crap issuing stuff in dmesg and then people
>> opening bugs and running around and trying to replace hardware.
> 
> We want to see warnings. Maybe they can be toned down. We even have
> dedicated distros for firmware testing.

I'm told that had we had this warning when the r740 BIOS was in
development, we would have solved a lot of the issues that I'm currently
working on. That would, in turn, have exposed bigger issues, and we
would have had a platform to fix and test those bigger issues.

Hardware vendors who test on Linux might be scratching their heads at
this error, though they tend to figure out what they're doing wrong, and
fix it.

One argument against was "expensive support calls", on which I call BS.
The firmware resources are expensive, but those are there whether or not
the customers call to complain.

Alex

>> Good mailing practices for 400: avoid top-posting and trim the reply.
> 
> Good mailing practices -- limit use of four letter words on public lists.

Then can't show word 'four'.
