linux-kernel - Re: [RFC PATCH] workqueue: introduce queue_delayed_work_on_offline

Message-ID: <eab73ac2-d9eb-4ac7-9e0f-9a02b81a31bb@oracle.com>
Date: Tue, 14 Jan 2025 14:01:12 +1100
From: imran.f.khan@...cle.com
To: Tejun Heo <tj@...nel.org>
Cc: jiangshanlai@...il.com, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] workqueue: introduce
 queue_delayed_work_on_offline_safe.

Hello Tejun,
Thanks for taking a look at it.
On 14/1/2025 5:21 am, Tejun Heo wrote:
> On Mon, Jan 13, 2025 at 03:35:40PM +1100, Imran Khan wrote:
> ...
>> I have kept the patch as RFC because from mailing list,
>> I could not find any users, of queue_delayed_work_on,
>> that is ending up queuing dwork on an offlined CPU.
>> We have some in-house code that is running into this problem,
>> and currently we are fixing it on caller side of queue_delayed_work_on.
>> Other users who run into this issue, can also use the approach of
>> fixing it on caller side or we can use the interface introduced
>> here for such use cases.
> 
> I'm not sure how necessary this is. If the timer is okay to run on other
> CPUs, might as well just use queue_delayed_work().
> 

Yes, right now I can't find anything in the upstream kernel that breaks
because of the issue described here.
All users of queue_delayed_work_on (except the 3 mentioned further down)
pass smp_processor_id() as the CPU, so the specified CPU can't be an
already offlined CPU.

I see the following 3 files (in v6.12.6) using queue_delayed_work_on with
some sort of cached CPU information:

  drivers/net/ethernet/pensando/ionic/ionic_dev.c -> line 177
  drivers/scsi/esas2r/esas2r_main.c -> line 1858
  drivers/scsi/lpfc/lpfc_sli.c
       -> line 14987
       -> line 15381

But it looks like in these cases the specified CPU remains online, or
they simply have not encountered the issue mentioned here.

For this patch, yes, the timer is okay to run on other CPUs, but only as
a last resort; most of the time it could still run on the specified
CPU (assuming it's online).
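
To make the "preferred CPU first, any CPU as a last resort" policy
concrete, here is a minimal userspace sketch of the decision only. It is
not the code from the patch. All model_* names and MODEL_* constants are
hypothetical stand-ins: model_cpu_online() plays the role of the kernel's
cpu_online(), and MODEL_CPU_UNBOUND plays the role of WORK_CPU_UNBOUND.
Real kernel code would presumably also need to hold cpus_read_lock()
around the check and the queuing, since the CPU could otherwise go
offline in between.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's WORK_CPU_UNBOUND ("any CPU"). */
#define MODEL_CPU_UNBOUND (-1)
/* Hypothetical CPU limit for this model; one bit per CPU in a uint64_t. */
#define MODEL_NR_CPUS 64

/*
 * Userspace model of cpu_online(): bit N of online_mask set means
 * CPU N is online. Out-of-range CPU numbers count as offline.
 */
static bool model_cpu_online(int cpu, uint64_t online_mask)
{
	if (cpu < 0 || cpu >= MODEL_NR_CPUS)
		return false;
	return (online_mask >> cpu) & 1u;
}

/*
 * Decide where the delayed work should be queued: the preferred
 * (possibly cached) CPU while it is online, otherwise fall back to
 * "unbound" so the work can run on any online CPU.
 */
static int model_pick_target_cpu(int preferred_cpu, uint64_t online_mask)
{
	if (model_cpu_online(preferred_cpu, online_mask))
		return preferred_cpu;
	return MODEL_CPU_UNBOUND;
}
```

With CPUs 0 and 2 online (mask 0x5), a cached CPU of 2 is kept as the
target, while a cached CPU of 1 falls back to the unbound case.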

Thanks,
Imran

> Thanks.
>