Message-ID: <20180511162951.GH12705@pd.tnic>
Date:   Fri, 11 May 2018 18:29:51 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     "Alex G." <mr.nuke.me@...il.com>
Cc:     alex_gagniuc@...lteam.com, austin_bolen@...l.com,
        shyam_iyer@...l.com, "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Len Brown <lenb@...nel.org>, Tony Luck <tony.luck@...el.com>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        Robert Moore <robert.moore@...el.com>,
        Erik Schmauss <erik.schmauss@...el.com>,
        Tyler Baicar <tbaicar@...eaurora.org>,
        Will Deacon <will.deacon@....com>,
        James Morse <james.morse@....com>,
        Shiju Jose <shiju.jose@...wei.com>,
        "Jonathan (Zhixiong) Zhang" <zjzhang@...eaurora.org>,
        Dongjiu Geng <gengdongjiu@...wei.com>,
        linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-edac@...r.kernel.org, devel@...ica.org
Subject: Re: [RFC PATCH v4 3/3] acpi: apei: Do not panic() on PCIe errors
 reported through GHES
On Fri, May 11, 2018 at 11:12:25AM -0500, Alex G. wrote:
> > I think *you* didn't get it: IS_ENABLED(CONFIG_ACPI_APEI_PCIEAER) is not
> > enough of a check to confirm that there actually *is* an AER driver to
> > handle the errors. If you really want to make sure the driver is loaded
> > and functioning, then you need an explicit registering mechanism or some
> > other way of checking it really is there and handling errors.
> 
> config ACPI_APEI_PCIEAER
> 	bool "APEI PCIe AER logging/recovering support"
> 	depends on ACPI_APEI && PCIEAER
> 	help
> 	  PCIe AER errors may be reported via APEI firmware first mode.
> 	  Turn on this option to enable the corresponding support.
> 
> PCIEAER is not modularizable. QED

QED my ass.

Read my f*cking email again: the presence of the *code* is
not enough of a check to confirm the error has been handled.
aer_recover_work_func() can fail, and so can the kfifo_put() in
aer_recover_queue().

You need an *actual* confirmation that the error has been handled
properly and *only* *then* not panic the system. Otherwise you are
potentially leaving those errors unhandled.
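
To illustrate the point, here is a minimal userspace sketch, not the
actual kernel code. All names in it (err_queue, err_queue_put,
ghes_decide) are hypothetical; err_queue_put() only mimics the property
of kfifo_put() that matters here, namely that it reports failure when
the queue is full. The decision to skip the panic is taken from that
return value, not from whether the handler was compiled in.

```c
#include <stdbool.h>
#include <stddef.h>

/* Deliberately tiny so the "queue full" case is easy to hit. */
#define ERR_QUEUE_SIZE 2

/* Hypothetical error record; real AER entries carry much more. */
struct err_entry {
	int severity;
};

/* Hypothetical bounded queue standing in for the recovery kfifo. */
struct err_queue {
	struct err_entry slots[ERR_QUEUE_SIZE];
	size_t count;
};

/*
 * Mimics the relevant behaviour of kfifo_put(): the entry is stored
 * only if there is room, and the caller is told whether it was.
 */
static bool err_queue_put(struct err_queue *q, struct err_entry e)
{
	if (q->count == ERR_QUEUE_SIZE)
		return false;
	q->slots[q->count++] = e;
	return true;
}

enum ghes_action {
	GHES_ACTION_RECOVER,	/* error was handed to the handler */
	GHES_ACTION_PANIC,	/* nobody took the error: do not continue */
};

/*
 * Skip the panic only when the hand-off actually succeeded.
 * "Handler code is built in" is not part of the decision.
 */
static enum ghes_action ghes_decide(struct err_queue *q, struct err_entry e)
{
	if (err_queue_put(q, e))
		return GHES_ACTION_RECOVER;
	return GHES_ACTION_PANIC;
}
```

With a queue of two slots, the first two errors passed to ghes_decide()
are queued and the third gets GHES_ACTION_PANIC. The sketch covers only
the enqueue half; a complete answer would also need the worker to report
back whether recovery itself succeeded.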
-- 
Regards/Gruss,
    Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
