Message-ID: <CAJZ5v0g90=6KzV9=efVsJLHqjE=+M6=mbYy2Xym53es4RdszmQ@mail.gmail.com>
Date: Wed, 29 Jan 2025 10:44:13 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Nathan Chancellor <nathan@...nel.org>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Xiaofei Tan <tanxiaofei@...wei.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>, lenb@...nel.org, linux-acpi@...r.kernel.org,
linux-kernel@...r.kernel.org, mchehab+huawei@...nel.org,
roberto.sassu@...wei.com, shiju.jose@...wei.com, prime.zeng@...ilicon.com,
linuxarm@...wei.com
Subject: Re: [PATCH v3] acpi: Fix HED module initialization order when it is built-in
On Wed, Jan 29, 2025 at 5:33 AM Nathan Chancellor <nathan@...nel.org> wrote:
>
> On Thu, Jan 23, 2025 at 08:35:51PM +0100, Rafael J. Wysocki wrote:
> > On Tue, Jan 21, 2025 at 3:23 AM Xiaofei Tan <tanxiaofei@...wei.com> wrote:
> > >
> > >
> > > On 2025/1/20 19:04, Jonathan Cameron wrote:
> > > > On Fri, 17 Jan 2025 10:29:57 +0800
> > > > Xiaofei Tan <tanxiaofei@...wei.com> wrote:
> > > >
> > > > >> When the HED module is built-in, its init runs after EVGED's because
> > > > >> both drivers are at the same initcall level, so the order is determined
> > > > >> by Makefile order. That order violates expectations, because RAS records
> > > > >> can't be handled in the time window in which EVGED has been initialized
> > > > >> but HED has not.
> > > >>
> > > > >> If the number of such RAS records exceeds the number of APEI HEST error
> > > > >> sources, all of the HEST resources could be occupied, which could then
> > > > >> affect subsequent RAS error reporting.
> > > >>
> > > > >> Change the initcall level of HED to subsys_init to fix the issue. If HED
> > > > >> is built as a module, the problem remains. To solve this problem
> > > > >> completely, change ACPI_HED from tristate to bool.
> > > >>
> > > >> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> > > > Given the change in approach (even though I reviewed this internally)
> > > > should probably have dropped my RB. Anyhow, consider this me
> > > > giving it again on list.
> > > OK. thanks.
> >
> > Applied as 6.14-rc material with a rewritten changelog and under a new
> > subject: "ACPI: HED: Always initialize before evged".
> >
> > Thanks!
>
> For what it's worth, I just bisected a new error message that I see when
> booting several x86_64 distribution configurations in QEMU to this
> change in -next as commit 19badc4e57c6 ("ACPI: HED: Always initialize
> before evged"):
>
> $ curl -LSso .config https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/raw/main/config
>
> $ make -skj"$(nproc)" ARCH=x86_64 CROSS_COMPILE=x86_64-linux- olddefconfig bzImage
>
> $ qemu-system-x86_64 \
> -display none \
> -nodefaults \
> -M q35 \
> -d unimp,guest_errors \
> -append 'console=ttyS0 earlycon=uart8250,io,0x3f8' \
> -kernel arch/x86/boot/bzImage \
> -initrd rootfs.cpio \
> -cpu host \
> -enable-kvm \
> -m 512m \
> -smp 8 \
> -serial mon:stdio
> ...
> [ 0.535126] Error: Driver 'hardware_error_device' is already registered, aborting...
> ...
>
> If there is any additional information I can provide or patches I can
> test, I am more than happy to do so. Apologies if this has already been
> reported or resolved, I did a search on the mailing list and I did not
> see anything.

No, it hasn't.

So AFAICS the commit in question needs to do more to switch over hed
to non-modular.

I'll drop it for now, thanks!
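
For reference, the ordering problem described in the quoted changelog can be
illustrated with a toy model. This is only a sketch in Python, not kernel
code: the Initcall type, the boot_order() helper and the link positions are
hypothetical, and it assumes both drivers start out at the "device" level
(where module_init() lands for built-in code). Built-in initcalls run level
by level, and within one level in link (Makefile) order, so moving hed to an
earlier level makes it run before evged regardless of link order.

```python
from typing import NamedTuple

# Built-in initcall levels, earliest first (rootfs omitted for brevity).
INITCALL_LEVELS = ["pure", "core", "postcore", "arch",
                   "subsys", "fs", "device", "late"]


class Initcall(NamedTuple):
    name: str
    level: str
    link_pos: int  # hypothetical position in link (Makefile) order


def boot_order(initcalls):
    """Return names sorted by initcall level, then by link order."""
    ordered = sorted(
        initcalls,
        key=lambda c: (INITCALL_LEVELS.index(c.level), c.link_pos),
    )
    return [c.name for c in ordered]


# Assumed starting point: same level, evged linked first.
before = [Initcall("evged", "device", 1), Initcall("hed", "device", 2)]
# After moving hed to an earlier level, link order no longer matters.
after = [Initcall("evged", "device", 1), Initcall("hed", "subsys", 2)]

print(boot_order(before))
print(boot_order(after))
```

The model only covers built-in ordering. It says nothing about the duplicate
'hardware_error_device' registration reported above.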