Message-ID: <20170731170017.2vwxhewivgpyvpea@intel.com>
Date: Mon, 31 Jul 2017 10:00:18 -0700
From: "Luck, Tony" <tony.luck@...el.com>
To: "Baicar, Tyler" <tbaicar@...eaurora.org>
Cc: Borislav Petkov <bp@...e.de>, rjw@...ysocki.net, lenb@...nel.org,
will.deacon@....com, james.morse@....com, shiju.jose@...wei.com,
geliangtang@...il.com, andriy.shevchenko@...ux.intel.com,
linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] acpi: apei: clear error status before acknowledging the error
On Mon, Jul 31, 2017 at 10:15:27AM -0600, Baicar, Tyler wrote:
> I think the better thing to do in this case is still send the ack. If
> ghes_read_estatus() fails, then either we are unable to read the estatus
> or the estatus is empty/invalid.
Right now we silently handle that failure of ghes_read_estatus(). That
might be hiding some Linux bugs if we are calling ghes_proc() in cases
where we shouldn't.
Perhaps we should have something like this, so if systems do start acting
weirdly there will be a note that we took this path:
	rc = ghes_read_estatus(ghes, 0);
	if (rc) {
		pr_notice("surprise failure reading ghes estatus\n");
		goto out;
	}
> If we do not send the ack, then we will be in a scenario where FW will not
> send any more errors.
We might ACK something that the firmware didn't send, which may
lead to other problems.
> I think it would be better to still have the FW send the errors and kernel
> complain about issues with
But I agree with this. We should send the ACK. Luckily this doesn't have
a long legacy problem because the whole ACK mechanism is a new thing. So
we only have to worry about BIOSes that support GHESv2.
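
To make the combined behaviour concrete, here is a rough userspace sketch of
the flow discussed above: note the unexpected read failure, then fall through
and acknowledge anyway, with the ack only meaning anything on a GHESv2 source.
Everything named mock_* (the struct, its fields, and the three functions) is
hypothetical and exists only for this illustration; none of it is the real
kernel code or the actual patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct ghes; not the real layout. */
struct mock_ghes {
	bool is_v2;         /* source is GHESv2, so it has an ack register */
	bool estatus_valid; /* what a read of the error status would find */
	int acks_sent;      /* acknowledgements written back to firmware */
	int notices;        /* diagnostics logged on the failure path */
};

/* Mock read: 0 on success, nonzero if the estatus is unreadable or empty. */
static int mock_read_estatus(const struct mock_ghes *g)
{
	return g->estatus_valid ? 0 : -1;
}

/* Mock ack: only a GHESv2 source has anything to acknowledge. */
static void mock_ack_error(struct mock_ghes *g)
{
	if (g->is_v2)
		g->acks_sent++;
}

/* Log the surprise failure, but always reach the ack at "out". */
static int mock_ghes_proc(struct mock_ghes *g)
{
	int rc = mock_read_estatus(g);

	if (rc) {
		fprintf(stderr, "surprise failure reading ghes estatus\n");
		g->notices++;
		goto out;
	}
	/* Normal path: the error would be reported and its status cleared. */
out:
	mock_ack_error(g);
	return rc;
}
```

With this shape a failed read on a GHESv2 source leaves one notice in the log
and still sends one ack, so firmware keeps delivering errors, while a
non-GHESv2 source never sees an ack at all.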
-Tony