[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211224001716.GA1324143@bhelgaas>
Date: Thu, 23 Dec 2021 18:17:16 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Shuai Xue <xueshuai@...ux.alibaba.com>,
"Rafael J. Wysocki" <rjw@...ysocki.net>
Cc: bp@...en8.de, tony.luck@...el.com, james.morse@....com,
lenb@...nel.org, bhelgaas@...gle.com,
zhangliguang@...ux.alibaba.com, zhuo.song@...ux.alibaba.com,
linux-kernel@...r.kernel.org, linux-acpi@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-pci@...r.kernel.org
Subject: Re: [RESEND PATCH v4] ACPI: Move sdei_init and ghes_init ahead to
handle platform errors earlier
[+to Rafael, question about HEST/GHES/SDEI init]
On Thu, Dec 23, 2021 at 04:11:11PM +0800, Shuai Xue wrote:
> 在 2021/12/22 AM7:17, Bjorn Helgaas 写道:
> > On Thu, Dec 16, 2021 at 09:34:56PM +0800, Shuai Xue wrote:
> >> On an ACPI system, ACPI is initialised very early from a subsys_initcall(),
> >> while SDEI is not ready until a subsys_initcall_sync().
> >>
> >> The SDEI driver provides functions (e.g. apei_sdei_register_ghes,
> >> apei_sdei_unregister_ghes) to register or unregister event callback for
> >> dispatcher in firmware. When the GHES driver probing, it registers the
> >> corresponding callback according to the notification type specified by
> >> GHES. If the GHES notification type is SDEI, the GHES driver will call
> >> apei_sdei_register_ghes to register event call.
> >>
> >> When the firmware emits an event, it migrates the handling of the event
> >> into the kernel at the registered entry-point __sdei_asm_handler. And
> >> finally, the kernel will call the registered event callback and return
> >> status_code to indicate the status of event handling. SDEI_EV_FAILED
> >> indicates that the kernel failed to handle the event.
> >>
> >> Consequently, when an error occurs during kernel booting, the kernel is
> >> unable to handle and report errors until the GHES driver is initialized by
> >> device_initcall(), in which the event callback is registered. All errors
> >> that occurred before GHES initialization are missed and there is no chance
> >> to report and find them again.
> >>
> >> From commit e147133a42cb ("ACPI / APEI: Make hest.c manage the estatus
> >> memory pool") was merged, ghes_init() relies on acpi_hest_init() to manage
> >> the estatus memory pool. On the other hand, ghes_init() relies on
> >> sdei_init() to detect the SDEI version and the framework for registering
> >> and unregistering events.
> >
> >> By the way, I don't figure out why acpi_hest_init is called in
> >> acpi_pci_root_init, it don't rely on any other thing. May it could
> >> be moved further, following acpi_iort_init in acpi_init.
> >> sdei_init() relies on ACPI table which is initialized
> >> subsys_initcall(): acpi_init(), acpi_bus_init(), acpi_load_tables(),
> >> acpi_tb_laod_namespace(). May it should be also moved further,
> >> after acpi_load_tables.
> >> In this patch, move sdei_init and ghes_init as far ahead as
> >> possible, right after acpi_hest_init().
> >
> > I'm having a hard time figuring out the reason for this patch.
> >
> > Apparently the relevant parts are sdei_init() and ghes_init().
> > Today they are executed in that order:
> >
> > subsys_initcall_sync(sdei_init);
> > device_initcall(ghes_init);
> >
> > After this patch, they would be executed in the same order, but called
> > explicitly instead of as initcalls:
> >
> > acpi_pci_root_init()
> > {
> > acpi_hest_init();
> > sdei_init();
> > ghes_init();
> > ...
> >
> > Explicit calls are certainly better than initcalls, but that doesn't
> > seem to be the reason for this patch.
> >
> > Does this patch fix a bug? If so, what is the bug?
>
> Yes. When the kernel booting, the console logs many times from firmware
> before GHES drivers init:
>
> Trip in MM PCIe RAS handle(Intr:910)
> Clean PE[1.1.1] ERR_STS:0x4000100 -> 0 INT_STS:F0000000
> Find RP(98:1.0)
> --Walk dev(98:1.0) CE:0 UCE:4000
> ...
> ERROR: sdei_dispatch_event(32a) ret:-1
> --handler(910) end
>
> This is because the callback function has not been registered yet.
> Previously reported errors will be overwritten by new ones. Therefore,
> all errors that occurred before GHES initialization are missed
> and there is no chance to report and find them again.
>
> > You say that currently "errors that occur before GHES initialization
> > are missed". Isn't that still true after this patch? Does this patch
> > merely reduce the time before GHES initialization? If so, I'm
> > dubious, because we have to tolerate an arbitrary amount of time
> > there.
>
> After this patch, there are still errors missing. As you mentioned,
> we have to tolerate it until the software reporting capability is built.
>
> Yes, this patch merely reduce the time before GHES initialization.
It's not a bug that errors that happen before the callback are lost.
At least, it's not a *Linux* bug. It might be a poor design of the
firmware error reporting interface.
If the only point of this patch is to reduce the time before GHES
initialization, the commit log should clearly say that.
> The boot dmesg before this patch:
>
> [ 3.688586] HEST: Table parsing has been initialized.
> ...
> [ 33.204340] calling sdei_init+0x0/0x120 @ 1
> [ 33.208645] sdei: SDEIv1.0 (0x0) detected in firmware.
> ...
> [ 36.005390] calling ghes_init+0x0/0x11c @ 1
> [ 36.190021] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
>
>
> After this patch, the boot dmesg like bellow:
>
> [ 3.688664] HEST: Table parsing has been initialized.
> [ 3.688691] sdei: SDEIv1.0 (0x0) detected in firmware.
> [ 3.694557] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
[Tangent: I think this GHES message is confusing. What "APEI bit"
does this refer to? The only bits I remember are the Flags bits in
HEST Error Source Descriptor Entries, e.g., ACPI v6.3, sec 18.3.2.
"WHEA _OSC" means nothing to me, and I didn't find anything useful
with grep, other than that "WHEA" might be an obsolete name for what
we now call "APEI".
I don't think there's anything in _OSC that mentions "firmware first."
I don't remember anything in the spec about a way to *enable* Firmware
First Error Handling needing (I'm looking at ACPI v6.3, sec 18.4).
I think the "firmware first" information is useless to the OS -- as
far as I can tell, the spec says nothing about anything the OS should
do based on the FIRMWARE_FIRST bits.]
> As we can see, the initialization of GHES is advanced by 33 seconds.
> So, in my opinion, this patch is necessary, right?
> (It should be noted that the effect of optimization varies with the platform.)
> >> -device_initcall(ghes_init);
> >
> >> void __init acpi_pci_root_init(void)
> >> {
> >> acpi_hest_init();
> >> + sdei_init();
> >> + ghes_init();
> >
> > What's the connection between PCI, SDEI, and GHES? As far as I can
> > tell, SDEI and GHES are not PCI-specific, so it doesn't seem like they
> > should be initialized here in acpi_pci_root_init().
>
> The only reason is that acpi_hest_init() is initialized here.
>
> From commit e147133a42cb ("ACPI / APEI: Make hest.c manage the estatus
> memory pool") was merged, ghes_init() relies on acpi_hest_init() to manage
> the estatus memory pool. On the other hand, ghes_init() relies on
> sdei_init() to detect the SDEI version and the framework for registering
> and unregistering events. The dependencies are as follows
>
> ghes_init() => acpi_hest_init()
> ghes_init() => sdei_init()
>
> I don't figure out why acpi_hest_init() is called in
> acpi_pci_root_init(), it don't rely on any other thing.
> I am wondering that should we moved all of them further? e.g.
> following acpi_iort_init() in acpi_init().
I don't know why acpi_hest_init() is called from acpi_pci_root_init().
It looks like HEST can support error sources other than PCI (IA-32
Machine Checks, NMIs, GHES, etc.) It was added by 415e12b23792
("PCI/ACPI: Request _OSC control once for each root bridge (v3)");
maybe Rafael remembers why.
Seem like acpi_hest_init(), sdei_init(), and ghes_init() should all go
somewhere else, but I don't know where. Maybe somewhere in
acpi_init()?
Bjorn
Powered by blists - more mailing lists