lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2026012214-dinginess-improper-9bc1@gregkh>
Date: Thu, 22 Jan 2026 11:29:15 +0100
From: Greg KH <gregkh@...uxfoundation.org>
To: Yicong Yang <yang.yicong@...oheart.com>
Cc: rafael@...nel.org, lenb@...nel.org, tglx@...nel.org, dakr@...nel.org,
	akpm@...ux-foundation.org, apatel@...tanamicro.com, pjw@...nel.org,
	palmer@...belt.com, aou@...s.berkeley.edu, alex@...ti.fr,
	geshijian@...oheart.com, weidong.wd@...oheart.com,
	linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-riscv@...ts.infradead.org
Subject: Re: [PATCH 1/2] ACPI: scan: Use async schedule function for
 acpi_scan_clear_dep_fn

On Thu, Jan 22, 2026 at 03:34:45PM +0800, Yicong Yang wrote:
> The device object rescan in acpi_scan_clear_dep_fn is scheduled
> in the system workqueue which is not guaranteed to be finished
> before entering userspace. This will cause the problem that
> some key devices are missed when the init task try to find them,
> e.g. console devices and root devices (PCIe nvme, etc).
> This issues is more possbile to happen on RISCV since these
> devices using GSI interrupt may depend on APLIC and will be
> scanned in acpi_scan_clear_dep_queue() after APLIC initialized.
> 
> Fix this by scheduling the acpi_scan_clear_dep_queue() using async
> schedule function rather than the system workqueue. The deferred
> works will be synchronized by async_synchronize_full() before
> entering init task.
> 
> Update the comment as well.
> 
> Signed-off-by: Yicong Yang <yang.yicong@...oheart.com>
> ---
>  drivers/acpi/scan.c | 35 ++++++++++++++++-------------------
>  1 file changed, 16 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 416d87f9bd10..bf0d8ba9ba19 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -5,6 +5,7 @@
>  
>  #define pr_fmt(fmt) "ACPI: " fmt
>  
> +#include <linux/async.h>
>  #include <linux/module.h>
>  #include <linux/init.h>
>  #include <linux/slab.h>
> @@ -2365,39 +2366,35 @@ struct acpi_scan_clear_dep_work {
>  	struct acpi_device *adev;
>  };
>  
> -static void acpi_scan_clear_dep_fn(struct work_struct *work)
> +static void acpi_scan_clear_dep_fn(void *dev, async_cookie_t cookie)
>  {
> -	struct acpi_scan_clear_dep_work *cdw;
> -
> -	cdw = container_of(work, struct acpi_scan_clear_dep_work, work);
> +	struct acpi_device *adev = to_acpi_device(dev);
>  
>  	acpi_scan_lock_acquire();
> -	acpi_bus_attach(cdw->adev, (void *)true);
> +	acpi_bus_attach(adev, (void *)true);
>  	acpi_scan_lock_release();
>  
> -	acpi_dev_put(cdw->adev);
> -	kfree(cdw);
> +	acpi_dev_put(adev);
>  }
>  
>  static bool acpi_scan_clear_dep_queue(struct acpi_device *adev)
>  {
> -	struct acpi_scan_clear_dep_work *cdw;
> -
>  	if (adev->dep_unmet)
>  		return false;
>  
> -	cdw = kmalloc(sizeof(*cdw), GFP_KERNEL);
> -	if (!cdw)
> -		return false;
> -
> -	cdw->adev = adev;
> -	INIT_WORK(&cdw->work, acpi_scan_clear_dep_fn);
>  	/*
> -	 * Since the work function may block on the lock until the entire
> -	 * initial enumeration of devices is complete, put it into the unbound
> -	 * workqueue.
> +	 * Async schedule the deferred acpi_scan_clear_dep_fn() since:
> +	 * - acpi_bus_attach() needs to hold acpi_scan_lock which cannot
> +	 *   be acquired under acpi_dep_list_lock (held here)
> +	 * - the deferred work at boot stage is ensured to be finished
> +	 *   before entering init task by the async_synchronize_full()
> +	 *   barrier
> +	 *
> +	 * Use _nocall variant since it'll return on failure instead of
> +	 * run the function synchronously.
>  	 */
> -	queue_work(system_dfl_wq, &cdw->work);
> +	if (!async_schedule_dev_nocall(acpi_scan_clear_dep_fn, &adev->dev))
> +		return false;

This really feels wrong to me, you are taking a code path that has been
working for quite a while and changing it.  Perhaps your system ACPI
tables are the thing that is incorrect here?

What exactly is the problem that you are seeing?  Why not start with
that and then we can work out how to solve that issue?

thanks,

greg k-h

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ