Message-ID: <CAJZ5v0igWB+c1ER15_0FSEBXfAaAUBXRwx3K_rt-POdsAYui8Q@mail.gmail.com>
Date: Tue, 27 Jan 2026 21:56:29 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Yicong Yang <yang.yicong@...oheart.com>
Cc: rafael@...nel.org, lenb@...nel.org, tglx@...nel.org, 
	gregkh@...uxfoundation.org, dakr@...nel.org, akpm@...ux-foundation.org, 
	apatel@...tanamicro.com, pjw@...nel.org, palmer@...belt.com, 
	aou@...s.berkeley.edu, alex@...ti.fr, geshijian@...oheart.com, 
	weidong.wd@...oheart.com, linux-acpi@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-riscv@...ts.infradead.org
Subject: Re: [PATCH v2] ACPI: scan: Use async schedule function for acpi_scan_clear_dep_fn

On Mon, Jan 26, 2026 at 8:04 AM Yicong Yang <yang.yicong@...oheart.com> wrote:
>
> The device object rescan in acpi_scan_clear_dep_fn() is scheduled
> on the system workqueue, which is not guaranteed to finish
> before userspace starts. As a result, some key devices may be
> missing when the userspace init task tries to find them. Two
> issues were observed on our RISC-V platforms:
>
> - A kernel panic because userspace init cannot open a console.
>   The console device scan is queued by acpi_scan_clear_dep_queue()
>   and has not finished by the time the userspace init process
>   runs, so no console has been created yet.
> - Entering the rescue shell because no root device (a PCIe NVMe
>   drive in our case) is found. The cause is the same: the PCIe
>   host bridge scan is queued the same way and finishes after the
>   init process has started.
>
> The reason is that both devices (console, PCIe host bridge)
> depend on the riscv-aplic irqchip to serve their interrupts (the
> console's wired interrupt and PCI's INTx interrupts). To honor
> this dependency, these devices are scanned and created after
> riscv-aplic is initialized. riscv-aplic is initialized in a
> device_initcall and queues the device scan work with
> acpi_scan_clear_dep_queue(), which is close to the time the
> userspace init process starts. Since acpi_scan_clear_dep_queue()
> uses system_dfl_wq with no synchronization, the issues occur if
> userspace init runs before these devices are ready.
>
> The solution is to wait for the queued work to finish before
> entering userspace init. One possible way is to use a dedicated
> workqueue instead of system_dfl_wq and explicitly flush it
> somewhere in the initcall stage before entering userspace.
> Another way is to use async_schedule_dev_nocall() for this
> device scanning. It is designed for asynchronous initialization
> and works the same as before, since it also uses a dedicated
> unbound workqueue, but the kernel init code waits for the work
> to finish (async_synchronize_full()) right before entering
> userspace init.
>
> This patch uses the second approach. Compared to a dedicated
> workqueue, it is simpler, since the async schedule framework
> already handles most of the work, such as synchronization and
> the allocation of work items and the workqueue. The ACPI code
> only needs to focus on its own work. A dedicated workqueue could
> also be redundant, since some platforms don't need
> acpi_scan_clear_dep_queue() for their device scanning.
>
> Signed-off-by: Yicong Yang <yang.yicong@...oheart.com>
> ---
> Change since v1:
> Refine the commit message to:
> - include the issues and the analysis
> - include the reason for using the async schedule rather than
>   a dedicated workqueue
> Link: https://lore.kernel.org/linux-riscv/20260122073446.45628-2-yang.yicong@picoheart.com/
>
>  drivers/acpi/scan.c | 40 ++++++++++++++++------------------------
>  1 file changed, 16 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 416d87f9bd10..64fcbd6a6adc 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -5,6 +5,7 @@
>
>  #define pr_fmt(fmt) "ACPI: " fmt
>
> +#include <linux/async.h>
>  #include <linux/module.h>
>  #include <linux/init.h>
>  #include <linux/slab.h>
> @@ -2360,44 +2361,35 @@ static int acpi_dev_get_next_consumer_dev_cb(struct acpi_dep_data *dep, void *da
>         return 0;
>  }
>
> -struct acpi_scan_clear_dep_work {
> -       struct work_struct work;
> -       struct acpi_device *adev;
> -};
> -
> -static void acpi_scan_clear_dep_fn(struct work_struct *work)
> +static void acpi_scan_clear_dep_fn(void *dev, async_cookie_t cookie)
>  {
> -       struct acpi_scan_clear_dep_work *cdw;
> -
> -       cdw = container_of(work, struct acpi_scan_clear_dep_work, work);
> +       struct acpi_device *adev = to_acpi_device(dev);
>
>         acpi_scan_lock_acquire();
> -       acpi_bus_attach(cdw->adev, (void *)true);
> +       acpi_bus_attach(adev, (void *)true);
>         acpi_scan_lock_release();
>
> -       acpi_dev_put(cdw->adev);
> -       kfree(cdw);
> +       acpi_dev_put(adev);
>  }
>
>  static bool acpi_scan_clear_dep_queue(struct acpi_device *adev)
>  {
> -       struct acpi_scan_clear_dep_work *cdw;
> -
>         if (adev->dep_unmet)
>                 return false;
>
> -       cdw = kmalloc(sizeof(*cdw), GFP_KERNEL);
> -       if (!cdw)
> -               return false;
> -
> -       cdw->adev = adev;
> -       INIT_WORK(&cdw->work, acpi_scan_clear_dep_fn);
>         /*
> -        * Since the work function may block on the lock until the entire
> -        * initial enumeration of devices is complete, put it into the unbound
> -        * workqueue.
> +        * Async schedule the deferred acpi_scan_clear_dep_fn() since:
> +        * - acpi_bus_attach() needs to hold acpi_scan_lock which cannot
> +        *   be acquired under acpi_dep_list_lock (held here)
> +        * - the deferred work at boot stage is ensured to be finished
> +        *   before userspace init task by the async_synchronize_full()
> +        *   barrier
> +        *
> +        * Use _nocall variant since it'll return on failure instead of
> +        * run the function synchronously.
>          */
> -       queue_work(system_dfl_wq, &cdw->work);
> +       if (!async_schedule_dev_nocall(acpi_scan_clear_dep_fn, &adev->dev))
> +               return false;
>
>         return true;

What about doing

    return !!async_schedule_dev_nocall(acpi_scan_clear_dep_fn, &adev->dev);

here?
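
A minimal userspace sketch of why the two forms should behave the same, assuming async_schedule_dev_nocall() reports success as a boolean-like value. fake_schedule() and the queue_*() helpers are hypothetical stand-ins for illustration, not kernel code:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for async_schedule_dev_nocall(): returns true
 * when the work was queued and false when queueing failed. The real
 * kernel function is not called here.
 */
static bool fake_schedule(bool can_queue)
{
	return can_queue;
}

/* Form used in the patch: explicit early return on failure. */
static bool queue_if_form(bool can_queue)
{
	if (!fake_schedule(can_queue))
		return false;

	return true;
}

/* Suggested form: return the normalized result directly. */
static bool queue_return_form(bool can_queue)
{
	return !!fake_schedule(can_queue);
}

/* Both forms agree for the success case and the failure case. */
static void check_forms_agree(void)
{
	assert(queue_if_form(true) == queue_return_form(true));
	assert(queue_if_form(false) == queue_return_form(false));
}
```

Calling check_forms_agree() exercises both outcomes. The single-return form drops one branch without changing what the caller sees.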

>  }
> --
