Message-ID: <07a701dc964d$0f0c1310$2d243930$@trustnetic.com>
Date: Thu, 5 Feb 2026 11:11:02 +0800
From: Jiawen Wu <jiawenwu@...stnetic.com>
To: "'Bjorn Helgaas'" <helgaas@...nel.org>
Cc: "'Rafael J. Wysocki'" <rafael@...nel.org>,
"'Tony Luck'" <tony.luck@...el.com>,
"'Borislav Petkov'" <bp@...en8.de>,
"'Hanjun Guo'" <guohanjun@...wei.com>,
"'Mauro Carvalho Chehab'" <mchehab@...nel.org>,
"'Shuai Xue'" <xueshuai@...ux.alibaba.com>,
"'Len Brown'" <lenb@...nel.org>,
"'Shiju Jose'" <shiju.jose@...wei.com>,
"'Bjorn Helgaas'" <bhelgaas@...gle.com>,
<linux-acpi@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] ACPI: APEI: Avoid NULL pointer dereference in ghes_estatus_pool_region_free
On Thu, Feb 5, 2026 5:46 AM, Bjorn Helgaas wrote:
> On Wed, Feb 04, 2026 at 10:03:34AM +0800, Jiawen Wu wrote:
> > On Wed, Feb 4, 2026 6:55 AM, Bjorn Helgaas wrote:
> > > On Tue, Feb 03, 2026 at 10:12:32AM +0800, Jiawen Wu wrote:
> > > > The function ghes_estatus_pool_region_free() is exported and is called
> > > > by the PCIe AER recovery path, which unconditionally invokes it to free
> > > > aer_capability_regs memory.
> > > >
> > > > Although current AER usage assumes memory comes from the GHES pool,
> > > > robustness requires guarding against pool unavailability. Add a NULL check
> > > > before calling gen_pool_free() to prevent crashes when the pool is not
> > > > initialized. This also makes the API safer for potential future use by
> > > > non-GHES callers.
> > >
> > > I'm not sure what you mean by "pool unavailability." I think getting
> > > here with ghes_estatus_pool==NULL means we have a logic error
> > > somewhere, and I don't think we should silently hide that error.
> > >
> > > I'm generally in favor of *not* checking so we find out if the caller
> > > forgot to keep track of the pointer correctly.
> >
> > "pool unavailability" means that when I attempt to call
> > aer_recover_queue() in an ethernet driver, which does not create
> > ghes_estatus_pool, it leads to a NULL pointer dereference.
>
> I guess that means you contemplate having an ethernet driver allocate
> and manage its own struct aer_capability_regs to pass to
> aer_recover_queue(). But I don't understand why such a driver would
> be involved in this part of the AER processing.
>
> Normally a device like a NIC that detects an error logs something in
> its local AER Capability, then sends an ERR_* message upstream. The
> Root Port that receives that ERR_* message generates an interrupt. In
> the native AER case, the Linux AER driver handles that interrupt,
> reads the error logs from the AER Capability of the device that sent
> the ERR_* message, and logs it. In the firmware-first case used by
> GHES, platform firmware handles the interrupt, reads the error logs,
> packages them up, and sends them to the Linux AER driver via GHES and
> aer_recover_queue().
>
> What's the PCIe hardware flow that would lead to an ethernet driver
> calling aer_recover_queue()? An Endpoint driver wouldn't receive the
> AER interrupt generated by the Root Port.
>
> I suppose a NIC could generate its own device-specific interrupt when
> it logs an error in its local AER Capability, but if it conforms to
> the PCIe spec, it should also send an ERR_* message, which would feed
> into the existing AER path. I don't think we'd want the existing AER
> path racing with a parallel AER path in the Endpoint driver.
Thank you for your detailed explanation.

I fully agree that aer_recover_queue() is intended for firmware-first error
reporting via GHES, and that an endpoint driver should not normally invoke it
directly.

However, in practice, we've encountered platforms where AER interrupts are not
delivered reliably, for example because of BIOS misconfiguration, AER being
disabled in firmware, or hardware that fails to generate ERR_* messages
correctly. On such systems, when a PCIe error occurs, the standard AER path is
never triggered, and the device remains in a stuck state.

To verify this, I simulated a PCIe error by injecting it via the NIC register.
On many platforms, the Linux AER driver did not respond at all.

As a device driver, we'd like to ensure best-effort recovery regardless of
platform AER support. Since pcie_do_recovery() encapsulates the complete and
correct recovery sequence, it's exactly what we need, but it's not exported.

Given this, could you advise on the proper way for an endpoint driver to
initiate full PCIe error recovery when AER is unavailable? Is there a
recommended pattern that safely achieves the same effect as pcie_do_recovery()
without duplicating its logic?

Thank you again for your guidance.
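To make the question concrete, here is a minimal userspace model of the
callback ordering that, as I understand it, pcie_do_recovery() applies to a
single device in the non-fatal case. It is only a sketch, not kernel code: the
names ers_result, err_handlers, model_recovery and demo_* are hypothetical
stand-ins (the callback names follow struct pci_error_handlers), and the real
function additionally resets the link for fatal errors and walks every device
below the bridge.

```c
#include <assert.h>
#include <stddef.h>

/* Names mirror pci_ers_result_t; the values here are illustrative only. */
enum ers_result {
	ERS_CAN_RECOVER,
	ERS_NEED_RESET,
	ERS_DISCONNECT,
	ERS_RECOVERED,
};

/* Hypothetical stand-in for struct pci_error_handlers. */
struct err_handlers {
	enum ers_result (*error_detected)(void *dev);
	enum ers_result (*mmio_enabled)(void *dev);
	enum ers_result (*slot_reset)(void *dev);
	void (*resume)(void *dev);
};

/*
 * Simplified model of the non-fatal recovery ordering for one device.
 * Returns 0 if the device ends up recovered, -1 otherwise.
 */
int model_recovery(const struct err_handlers *h, void *dev)
{
	enum ers_result status;

	assert(h != NULL);
	if (!h->error_detected)
		return -1;	/* simplification: no handler means no recovery */

	status = h->error_detected(dev);
	if (status == ERS_CAN_RECOVER && h->mmio_enabled)
		status = h->mmio_enabled(dev);
	if (status == ERS_NEED_RESET && h->slot_reset)
		status = h->slot_reset(dev);
	if (status != ERS_RECOVERED)
		return -1;

	if (h->resume)
		h->resume(dev);
	return 0;
}

/* Demo device that records which callbacks ran, in order. */
struct demo_dev {
	char trace[8];
	int n;
};

static void demo_mark(void *dev, char c)
{
	struct demo_dev *d = dev;

	if (d->n < (int)sizeof(d->trace) - 1)
		d->trace[d->n++] = c;
}

static enum ers_result demo_detected(void *dev)
{
	demo_mark(dev, 'D');
	return ERS_NEED_RESET;
}

static enum ers_result demo_slot_reset(void *dev)
{
	demo_mark(dev, 'S');
	return ERS_RECOVERED;
}

static void demo_resume(void *dev)
{
	demo_mark(dev, 'R');
}

static const struct err_handlers demo_handlers = {
	.error_detected = demo_detected,
	.slot_reset = demo_slot_reset,
	.resume = demo_resume,
};
```

With demo_handlers, model_recovery() runs error_detected, slot_reset and
resume in that order and returns 0. What I am missing is a supported way for
an endpoint driver to trigger the equivalent sequence in the kernel.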