Message-Id: <7512bdfc-96ff-480a-a4e2-3b544f86d59d@picoheart.com>
Date: Thu, 22 Jan 2026 20:50:33 +0800
From: "Yicong Yang" <yang.yicong@...oheart.com>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: <yang.yicong@...oheart.com>, <lenb@...nel.org>, <tglx@...nel.org>,
<gregkh@...uxfoundation.org>, <dakr@...nel.org>,
<akpm@...ux-foundation.org>, <apatel@...tanamicro.com>, <pjw@...nel.org>,
<palmer@...belt.com>, <aou@...s.berkeley.edu>, <alex@...ti.fr>,
<geshijian@...oheart.com>, <weidong.wd@...oheart.com>,
<linux-acpi@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linux-riscv@...ts.infradead.org>
Subject: Re: [PATCH 1/2] ACPI: scan: Use async schedule function for acpi_scan_clear_dep_fn
On 1/22/26 7:19 PM, Rafael J. Wysocki wrote:
> On Thu, Jan 22, 2026 at 8:35 AM Yicong Yang <yang.yicong@...oheart.com> wrote:
>>
>> The device object rescan in acpi_scan_clear_dep_fn() is scheduled
>> on the system workqueue, which is not guaranteed to finish
>> before entering userspace. As a result, some key devices may be
>> missing when the init task tries to find them, e.g. console
>> devices and root devices (PCIe NVMe, etc.).
>> This issue is more likely to happen on RISC-V, since devices
>> using GSI interrupts may depend on the APLIC and are only scanned
>> in acpi_scan_clear_dep_queue() after the APLIC is initialized.
>>
>> Fix this by scheduling the acpi_scan_clear_dep_queue() using the async
>> schedule function rather than the system workqueue. The deferred
>> work will be synchronized by async_synchronize_full() before
>> entering the init task.
>>
>> Update the comment as well.
>>
>> Signed-off-by: Yicong Yang <yang.yicong@...oheart.com>
>> ---
>> drivers/acpi/scan.c | 35 ++++++++++++++++-------------------
>> 1 file changed, 16 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
>> index 416d87f9bd10..bf0d8ba9ba19 100644
>> --- a/drivers/acpi/scan.c
>> +++ b/drivers/acpi/scan.c
>> @@ -5,6 +5,7 @@
>>
>> #define pr_fmt(fmt) "ACPI: " fmt
>>
>> +#include <linux/async.h>
>> #include <linux/module.h>
>> #include <linux/init.h>
>> #include <linux/slab.h>
>> @@ -2365,39 +2366,35 @@ struct acpi_scan_clear_dep_work {
>> struct acpi_device *adev;
>> };
>>
>> -static void acpi_scan_clear_dep_fn(struct work_struct *work)
>> +static void acpi_scan_clear_dep_fn(void *dev, async_cookie_t cookie)
>> {
>> - struct acpi_scan_clear_dep_work *cdw;
>> -
>> - cdw = container_of(work, struct acpi_scan_clear_dep_work, work);
>> + struct acpi_device *adev = to_acpi_device(dev);
>>
>> acpi_scan_lock_acquire();
>> - acpi_bus_attach(cdw->adev, (void *)true);
>> + acpi_bus_attach(adev, (void *)true);
>> acpi_scan_lock_release();
>>
>> - acpi_dev_put(cdw->adev);
>> - kfree(cdw);
>> + acpi_dev_put(adev);
>> }
>>
>> static bool acpi_scan_clear_dep_queue(struct acpi_device *adev)
>> {
>> - struct acpi_scan_clear_dep_work *cdw;
>> -
>> if (adev->dep_unmet)
>> return false;
>>
>> - cdw = kmalloc(sizeof(*cdw), GFP_KERNEL);
>> - if (!cdw)
>> - return false;
>> -
>> - cdw->adev = adev;
>> - INIT_WORK(&cdw->work, acpi_scan_clear_dep_fn);
>> /*
>> - * Since the work function may block on the lock until the entire
>> - * initial enumeration of devices is complete, put it into the unbound
>> - * workqueue.
>> + * Async schedule the deferred acpi_scan_clear_dep_fn() since:
>> + * - acpi_bus_attach() needs to hold acpi_scan_lock which cannot
>> + * be acquired under acpi_dep_list_lock (held here)
>> + * - the deferred work at boot stage is ensured to be finished
>> + * before entering init task by the async_synchronize_full()
>> + * barrier
>> + *
>> + * Use _nocall variant since it'll return on failure instead of
>> + * run the function synchronously.
>> */
>> - queue_work(system_dfl_wq, &cdw->work);
>> + if (!async_schedule_dev_nocall(acpi_scan_clear_dep_fn, &adev->dev))
>
> If the problem is that system_dfl_wq is too slow, why don't you just
> try a dedicated workqueue for this?
>
> There's no need to modify all of this code.
>
The problem is that this work is not finished before entering userspace,
so some key devices, like the console or the PCIe NVMe root device, are not
ready by the time userspace init runs.
If we use a dedicated workqueue, we still need to synchronize somewhere
before entering userspace to solve the problem. But that is exactly what
async_schedule*() does: it queues the function on async_wq (also an unbound
workqueue), and async_synchronize_full() waits for it to finish before the
init process is executed. Does that make sense?
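To illustrate the schedule-then-barrier pattern described above, here is a minimal userspace sketch, assuming POSIX threads. It is only an analogue of the idea and not the kernel's async implementation: all demo_* names are hypothetical, one thread per function stands in for the workqueue, and two flags stand in for the console and root devices.

```c
/* Userspace analogue only; the demo_* names are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef void (*demo_async_fn)(void *data);

struct demo_async_entry {
	demo_async_fn fn;
	void *data;
};

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t demo_done = PTHREAD_COND_INITIALIZER;
static int demo_pending;	/* scheduled but not yet finished */

/* Drop one pending entry and wake the barrier when none are left. */
static void demo_complete_one(void)
{
	pthread_mutex_lock(&demo_lock);
	if (--demo_pending == 0)
		pthread_cond_broadcast(&demo_done);
	pthread_mutex_unlock(&demo_lock);
}

static void *demo_async_thread(void *arg)
{
	struct demo_async_entry *e = arg;

	e->fn(e->data);
	free(e);
	demo_complete_one();
	return NULL;
}

/*
 * Schedule fn(data) to run asynchronously. Returns 0 on success and -1 on
 * failure; on failure fn is NOT run synchronously, which mirrors the
 * "_nocall" behaviour described in the patch comment.
 */
static int demo_async_schedule_nocall(demo_async_fn fn, void *data)
{
	struct demo_async_entry *e = malloc(sizeof(*e));
	pthread_t tid;

	if (!e)
		return -1;
	e->fn = fn;
	e->data = data;

	pthread_mutex_lock(&demo_lock);
	demo_pending++;
	pthread_mutex_unlock(&demo_lock);

	if (pthread_create(&tid, NULL, demo_async_thread, e)) {
		free(e);
		demo_complete_one();
		return -1;
	}
	pthread_detach(tid);
	return 0;
}

/* Barrier: block until every scheduled function has finished. */
static void demo_async_synchronize_full(void)
{
	pthread_mutex_lock(&demo_lock);
	while (demo_pending > 0)
		pthread_cond_wait(&demo_done, &demo_lock);
	assert(demo_pending == 0);
	pthread_mutex_unlock(&demo_lock);
}

/* Stand-in for the deferred attach: marks one device as ready. */
static void demo_attach(void *data)
{
	*(int *)data = 1;
}

/* Returns 1 if every successfully scheduled device is ready after the barrier. */
int demo_boot(void)
{
	int console_ready = 0, rootdev_ready = 0, scheduled = 0;

	if (demo_async_schedule_nocall(demo_attach, &console_ready) == 0)
		scheduled++;
	if (demo_async_schedule_nocall(demo_attach, &rootdev_ready) == 0)
		scheduled++;

	demo_async_synchronize_full();	/* wait before "starting init" */
	return console_ready + rootdev_ready == scheduled;
}
```

Without the demo_async_synchronize_full() call, demo_boot() could read the flags before the attach functions have run, which is the same race as userspace init starting before the deferred rescan finishes. A dedicated workqueue alone would be in the same position until an equivalent barrier is added.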
Thanks.