linux-kernel - Re: [workqueue] d5bff968ea: WARNING:at_kernel/workqueue.c:#process_one

Message-ID: <a69a62ca-b7b1-bb4c-5c6f-8e186109b920@linux.intel.com>
Date:   Thu, 28 Jan 2021 15:52:40 +0800
From:   Xing Zhengjun <zhengjun.xing@...ux.intel.com>
To:     Hillf Danton <hdanton@...a.com>
Cc:     Oliver Sang <oliver.sang@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Lai Jiangshan <laijs@...ux.alibaba.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...el.com,
        lkp <lkp@...ts.01.org>
Subject: Re: [workqueue] d5bff968ea:
 WARNING:at_kernel/workqueue.c:#process_one_work



On 1/27/2021 5:21 PM, Hillf Danton wrote:
> On Wed, 27 Jan 2021 16:04:25 +0800 Xing Zhengjun wrote:
>> On 1/26/2021 3:39 PM, Hillf Danton wrote:
>>> On 26 Jan 2021 10:45:21 +0800 Xing Zhengjun wrote:
>>>> On 1/25/2021 5:29 PM, Hillf Danton wrote:
>>>>> On 25 Jan 2021 16:31:32 +0800 Xing Zhengjun wrote:
>>>>>> On 1/22/2021 3:59 PM, Hillf Danton wrote:
>>>>>>> On Fri, 22 Jan 2021 09:48:32 +0800 Xing Zhengjun wrote:
>>>>>>>> On 1/21/2021 12:00 PM, Hillf Danton wrote:
>>>>>>>>> On Wed, 20 Jan 2021 21:46:33 +0800 Oliver Sang wrote:
>>>>>>>>>> On Fri, Jan 15, 2021 at 03:24:32PM +0800, Hillf Danton wrote:
>>>>>>>>>>> Thu, 14 Jan 2021 15:45:11 +0800
>>>>>>>>>>>>
>>>>>>>>>>>> FYI, we noticed the following commit (built with gcc-9):
>>>>>>>>>>>>
>>>>>>>>>>>> commit: d5bff968ea9cc005e632d9369c26cbd8148c93d5 ("workqueue: break affinity initiatively")
>>>>>>>>>>>> https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git dev.2021.01.11b
>>>>>>>>>>>>
>>>>>>>>>>> [...]
>>>>>>>>>>>>
>>>>>>>>>>>> [   73.794288] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2192 process_one_work
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your report.
>>>>>>>>>>>
>>>>>>>>>>> We can also break CPU affinity by checking POOL_DISASSOCIATED at attach
>>>>>>>>>>> time without extra cost paid; that way we have the same behavior as at
>>>>>>>>>>> the unbind time.
>>>>>>>>>>>
>>>>>>>>>>> What is more, the change that makes kworkers pcpu is dropped because
>>>>>>>>>>> it would help neither hotplug nor the stop-machine mechanism.
>>>>>>>>>>
>>>>>>>>>> Hi, after applying the patch below, the issue still occurs.
>>>>>>>>>
>>>>>>>>> Thanks for your report.
>>>>>>>>>>
>>>>>>>>>> [ 4.574467] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
>>>>>>>>>> [ 4.575651] pci 0000:00:01.0: Activating ISA DMA hang workarounds
>>>>>>>>>> [ 4.576900] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
>>>>>>>>>> [ 4.578648] PCI: CLS 0 bytes, default 64
>>>>>>>>>> [ 4.579685] Unpacking initramfs...
>>>>>>>>>> [ 8.878031] -----------[ cut here ]-----------
>>>>>>>>>> [ 8.879083] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2187 process_one_work+0x92/0x9e0
>>>>>>>>>> [ 8.880688] Modules linked in:
>>>>>>>>>> [ 8.881274] CPU: 0 PID: 22 Comm: kworker/1:0 Not tainted 5.11.0-rc3-gc213503139bb #2
>>>>>>>>>
>>>>>>>>> The kworker bound to CPU1 runs on CPU0 and triggers the warning, which
>>>>>>>>> shows that the scheduler breaks CPU affinity, after 06249738a41a
>>>>>>>>> ("workqueue: Manually break affinity on hotplug"), though quite likely
>>>>>>>>> by kworker/1:0 for the initial workers.
>>>>>>>>>
>>>>>>>>>> [ 8.882518] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>>>>>>>>>> [ 8.887539] Workqueue: 0x0 (events)
>>>>>>>>>> [ 8.887838] EIP: process_one_work+0x92/0x9e0
>>>>>>>>>> [ 8.887838] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 74 85 13 00 ff 05 b8 30 04 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>>>>>>>>>> [ 8.887838] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
>>>>>>>>>> [ 8.887838] ESI: 43c04720 EDI: 42e45620 EBP: de7f23c0 ESP: 43d7bf08
>>>>>>>>>> [ 8.887838] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>>>>>>>>>> [ 8.887838] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0
>>>>>>>>>> [ 8.887838] Call Trace:
>>>>>>>>>> [ 8.887838] ? worker_thread+0x98/0x6a0
>>>>>>>>>> [ 8.887838] ? worker_thread+0x2dd/0x6a0
>>>>>>>>>> [ 8.887838] ? kthread+0x1ba/0x1e0
>>>>>>>>>> [ 8.887838] ? create_worker+0x1e0/0x1e0
>>>>>>>>>> [ 8.887838] ? kzalloc+0x20/0x20
>>>>>>>>>> [ 8.887838] ? ret_from_fork+0x1c/0x28
>>>>>>>>>> [ 8.887838] _warn_unseeded_randomness: 63 callbacks suppressed
>>>>>>>>>> [ 8.887838] random: get_random_bytes called from init_oops_id+0x2b/0x60 with crng_init=0
>>>>>>>>>> [ 8.887838] --[ end trace ac461b4d54c37cfa ]--
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Instead of creating the initial workers only on the active CPUs, rebind
>>>>>>>>> them (labeled pcpu) and jump to the right CPU at bootup time.
>>>>>>>>>
>>>>>>>>> --- a/kernel/workqueue.c
>>>>>>>>> +++ b/kernel/workqueue.c
>>>>>>>>> @@ -2385,6 +2385,16 @@ woke_up:
>>>>>>>>>       		return 0;
>>>>>>>>>       	}
>>>>>>>>>       
>>>>>>>>> +	if (!(pool->flags & POOL_DISASSOCIATED) && smp_processor_id() !=
>>>>>>>>> +								pool->cpu) {
>>>>>>>>> +		/* scheduler breaks CPU affinity for us, rebind it */
>>>>>>>>> +		raw_spin_unlock_irq(&pool->lock);
>>>>>>>>> +		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>>>>>>>> +		/* and jump to the right seat */
>>>>>>>>> +		schedule_timeout_interruptible(1);
>>>>>>>>> +		goto woke_up;
>>>>>>>>> +	}
>>>>>>>>> +
>>>>>>>>>       	worker_leave_idle(worker);
>>>>>>>>>       recheck:
>>>>>>>>>       	/* no more worker necessary? */
>>>>>>>>> --
>>>>>>>>>
>>>>>>>> I tested the patch; the warning still appears in the kernel log.
>>>>>>>
>>>>>>> Thanks for your report.
>>>>>>>>
>>>>>>>> [  230.356503] smpboot: CPU 1 is now offline
>>>>>>>> [  230.544652] x86: Booting SMP configuration:
>>>>>>>> [  230.545077] smpboot: Booting Node 0 Processor 1 APIC 0x1
>>>>>>>> [  230.545640] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>>>>>>>> [  230.545675] masked ExtINT on CPU#1
>>>>>>>> [  230.593829] ------------[ cut here ]------------
>>>>>>>> [  230.594257] WARNING: CPU: 0 PID: 257 at kernel/workqueue.c:2192 process_one_work+0x92/0x9e0
>>>>>>>> [  230.594990] Modules linked in: rcutorture torture mousedev input_leds
>>>>>>>> led_class pcspkr psmouse evbug tiny_power_button button
>>>>>>>> [  230.595961] CPU: 0 PID: 257 Comm: kworker/1:3 Not tainted 5.11.0-rc3-gdcba55d9080f #2
>>>>>>>
>>>>>>> As reported before, a kworker bound to CPU1 runs on CPU0 and triggers
>>>>>>> the warning, due to the scheduler breaking CPU affinity for us. What is
>>>>>>> new is that the affinity was broken at offline time instead of at bootup.
>>>>>>>
>>>>>>>> [  230.596621] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>>>>>>>> [  230.597322] Workqueue:  0x0 (rcu_gp)
>>>>>>>> [  230.597636] EIP: process_one_work+0x92/0x9e0
>>>>>>>> [  230.598005] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00
>>>>>>>> 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 f4 85 13 00 ff 05 cc 30 04
>>>>>>>> 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>>>>>>>> [  230.599569] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
>>>>>>>> [  230.600100] ESI: 43d94240 EDI: df4040f4 EBP: de7f23c0 ESP: bf5f1f08
>>>>>>>> [  230.600629] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>>>>>>>> [  230.601203] CR0: 80050033 CR2: 01bdecbc CR3: 04e2c000 CR4: 000406d0
>>>>>>>> [  230.601735] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
>>>>>>>> [  230.602265] DR6: fffe0ff0 DR7: 00000400
>>>>>>>> [  230.602594] Call Trace:
>>>>>>>> [  230.602813]  ? process_one_work+0x20e/0x9e0
>>>>>>>> [  230.603181]  ? worker_thread+0x32d/0x700
>>>>>>>> [  230.603522]  ? kthread+0x1ba/0x1e0
>>>>>>>> [  230.603818]  ? create_worker+0x1e0/0x1e0
>>>>>>>> [  230.604157]  ? kzalloc+0x20/0x20
>>>>>>>> [  230.604524]  ? ret_from_fork+0x1c/0x28
>>>>>>>> [  230.604850] ---[ end trace 06b1e66b5e17fa85 ]---
>>>>>>>> [  230.605504] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
>>>>>>>> [  230.766960] smpboot: CPU 1 is now offline
>>>>>>>> [  230.814803] x86: Booting SMP configuration:
>>>>>>>> [  230.815306] smpboot: Booting Node 0 Processor 1 APIC 0x1
>>>>>>>> [  230.815964] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>>>>>>>
>>>>>>>
>>>>>>> Unlike the above diff, which at most papers over a problem sitting
>>>>>>> somewhere in the scheduler, change how a worker is created by
>>>>>>> skipping set_cpus_allowed_ptr(), because we will wake the worker up
>>>>>>> after attaching it to the worker pool.
>>>>>>>
>>>>>>> If we can ignore the rescuer for now, then the allowed ptr is only
>>>>>>> updated at on/offline time; let's see the difference at boot time.
>>>>>>>
>>>>>>>
>>>>>>> --- a/kernel/workqueue.c
>>>>>>> +++ b/kernel/workqueue.c
>>>>>>> @@ -1844,16 +1844,10 @@ static struct worker *alloc_worker(int n
>>>>>>>       * cpu-[un]hotplugs.
>>>>>>>       */
>>>>>>>      static void worker_attach_to_pool(struct worker *worker,
>>>>>>> -				   struct worker_pool *pool)
>>>>>>> +				   struct worker_pool *pool,
>>>>>>> +				   int update_cpus_allowed)
>>>>>>>      {
>>>>>>>      	mutex_lock(&wq_pool_attach_mutex);
>>>>>>> -
>>>>>>> -	/*
>>>>>>> -	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
>>>>>>> -	 * online CPUs.  It'll be re-applied when any of the CPUs come up.
>>>>>>> -	 */
>>>>>>> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>>>>>> -
>>>>>>>      	/*
>>>>>>>      	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>>>>>>>      	 * stable across this function.  See the comments above the flag
>>>>>>> @@ -1867,6 +1861,9 @@ static void worker_attach_to_pool(struct
>>>>>>>      	list_add_tail(&worker->node, &pool->workers);
>>>>>>>      	worker->pool = pool;
>>>>>>>      
>>>>>>> +	if (update_cpus_allowed)
>>>>>>> +		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>>>>>> +
>>>>>>>      	mutex_unlock(&wq_pool_attach_mutex);
>>>>>>>      }
>>>>>>>      
>>>>>>> @@ -1942,8 +1939,11 @@ static struct worker *create_worker(stru
>>>>>>>      	set_user_nice(worker->task, pool->attrs->nice);
>>>>>>>      	kthread_bind_mask(worker->task, pool->attrs->cpumask);
>>>>>>>      
>>>>>>> -	/* successful, attach the worker to the pool */
>>>>>>> -	worker_attach_to_pool(worker, pool);
>>>>>>> +	/*
>>>>>>> +	 * attach the worker to the pool without asking scheduler to
>>>>>>> +	 * update CPUs allowed
>>>>>>> +	 */
>>>>>>> +	worker_attach_to_pool(worker, pool, 0);
>>>>>>>      
>>>>>>>      	/* start the newly created worker */
>>>>>>>      	raw_spin_lock_irq(&pool->lock);
>>>>>>> @@ -2508,7 +2508,7 @@ repeat:
>>>>>>>      
>>>>>>>      		raw_spin_unlock_irq(&wq_mayday_lock);
>>>>>>>      
>>>>>>> -		worker_attach_to_pool(rescuer, pool);
>>>>>>> +		worker_attach_to_pool(rescuer, pool, 1);
>>>>>>>      
>>>>>>>      		raw_spin_lock_irq(&pool->lock);
>>>>>>>      
>>>>>>> --
>>>>>>>
>>>>>> I tested the patch; the warning still appears in the kernel log.
>>>>>
>>>>> Thanks.
>>>>>>
>>>>>> [   55.754187] smpboot: Booting Node 0 Processor 1 APIC 0x1
>>>>>> [   55.785594] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>>>>>> [   55.785646] masked ExtINT on CPU#1
>>>>>> [   55.920602] ------------[ cut here ]------------
>>>>>> [   55.921355] WARNING: CPU: 0 PID: 160 at kernel/workqueue.c:2192 process_one_work+0x92/0x9e0
>>>>>> [   55.922583] Modules linked in: rcutorture torture mousedev evbug
>>>>>> input_leds led_class tiny_power_button psmouse pcspkr button
>>>>>> [   55.924294] CPU: 0 PID: 160 Comm: kworker/1:2 Not tainted 5.11.0-rc3-00186-g77bf4e461cfa #2
>>>>>
>>>>> Same issue as before.
>>>>>
>>>>>> [   55.925552] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>>>>>> [   55.926763] Workqueue:  0x0 (rcu_gp)
>>>>>> [   55.927298] EIP: process_one_work+0x92/0x9e0
>>>>>> [   55.927950] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00
>>>>>> 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 94 85 13 00 ff 05 b8 30 04
>>>>>> 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>>>>>> [   55.930726] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
>>>>>> [   55.931642] ESI: 43d90540 EDI: df48c0f4 EBP: de7f23c0 ESP: bfb47f08
>>>>>> [   55.932590] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>>>>>> [   55.933609] CR0: 80050033 CR2: 024e994c CR3: 7fd80000 CR4: 000406d0
>>>>>> [   55.934555] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
>>>>>> [   55.935457] DR6: fffe0ff0 DR7: 00000400
>>>>>> [   55.936041] Call Trace:
>>>>>> [   55.936534]  ? process_one_work+0x20e/0x9e0
>>>>>> [   55.937305]  ? worker_thread+0x2dd/0x6a0
>>>>>> [   55.938018]  ? kthread+0x1ba/0x1e0
>>>>>> [   55.938598]  ? create_worker+0x1e0/0x1e0
>>>>>> [   55.939315]  ? kzalloc+0x20/0x20
>>>>>> [   55.940000]  ? ret_from_fork+0x1c/0x28
>>>>>> [   55.940627] ---[ end trace d155e9e6402de179 ]---
>>>>>> [   55.941641] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
>>>>>> [   56.155271] smpboot: CPU 1 is now offline
>>>>>> [   56.193613] x86: Booting SMP configuration:
>>>>>> [   56.194400] smpboot: Booting Node 0 Processor 1 APIC 0x1
>>>>>
>>>>> The changes in the diff below are
>>>>>
>>>>> 1/ at rescue time, change CPU affinity only if POOL_DISASSOCIATED
>>>>> is not set, and print a warning the same way as at offline time.
>>>>>
>>>>> 2/ at offline time, don't update the allowed CPUs after setting
>>>>> POOL_DISASSOCIATED because we no longer have any interest in affinity.
>>>>>
>>>>> 3/ at online time, mark pcpu before binding affinity.
>>>>>
>>>>> Though one change per diff would be preferable, the WARNs let us tell
>>>>> which is which if any warning goes into dmesg.
>>>>>
>>>>> --- a/kernel/workqueue.c
>>>>> +++ b/kernel/workqueue.c
>>>>> @@ -1844,25 +1844,23 @@ static struct worker *alloc_worker(int n
>>>>>      * cpu-[un]hotplugs.
>>>>>      */
>>>>>     static void worker_attach_to_pool(struct worker *worker,
>>>>> -				   struct worker_pool *pool)
>>>>> +				   struct worker_pool *pool, int set)
>>>>>     {
>>>>>     	mutex_lock(&wq_pool_attach_mutex);
>>>>>     
>>>>>     	/*
>>>>> -	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
>>>>> -	 * online CPUs.  It'll be re-applied when any of the CPUs come up.
>>>>> -	 */
>>>>> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>>>> -
>>>>> -	/*
>>>>>     	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>>>>>     	 * stable across this function.  See the comments above the flag
>>>>>     	 * definition for details.
>>>>>     	 */
>>>>>     	if (pool->flags & POOL_DISASSOCIATED)
>>>>>     		worker->flags |= WORKER_UNBOUND;
>>>>> -	else
>>>>> +	else {
>>>>>     		kthread_set_per_cpu(worker->task, true);
>>>>> +		if (set)
>>>>> +			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
>>>>> +						pool->attrs->cpumask) < 0);
>>>>> +	}
>>>>>     
>>>>>     	list_add_tail(&worker->node, &pool->workers);
>>>>>     	worker->pool = pool;
>>>>> @@ -1943,7 +1941,7 @@ static struct worker *create_worker(stru
>>>>>     	kthread_bind_mask(worker->task, pool->attrs->cpumask);
>>>>>     
>>>>>     	/* successful, attach the worker to the pool */
>>>>> -	worker_attach_to_pool(worker, pool);
>>>>> +	worker_attach_to_pool(worker, pool, 0);
>>>>>     
>>>>>     	/* start the newly created worker */
>>>>>     	raw_spin_lock_irq(&pool->lock);
>>>>> @@ -2508,7 +2506,7 @@ repeat:
>>>>>     
>>>>>     		raw_spin_unlock_irq(&wq_mayday_lock);
>>>>>     
>>>>> -		worker_attach_to_pool(rescuer, pool);
>>>>> +		worker_attach_to_pool(rescuer, pool, 1);
>>>>>     
>>>>>     		raw_spin_lock_irq(&pool->lock);
>>>>>     
>>>>> @@ -4923,7 +4921,6 @@ static void unbind_workers(int cpu)
>>>>>     
>>>>>     		for_each_pool_worker(worker, pool) {
>>>>>     			kthread_set_per_cpu(worker->task, false);
>>>>> -			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
>>>>>     		}
>>>>>     
>>>>>     		mutex_unlock(&wq_pool_attach_mutex);
>>>>> @@ -4977,9 +4974,9 @@ static void rebind_workers(struct worker
>>>>>     	 * from CPU_ONLINE, the following shouldn't fail.
>>>>>     	 */
>>>>>     	for_each_pool_worker(worker, pool) {
>>>>> +		kthread_set_per_cpu(worker->task, true);
>>>>>     		WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
>>>>>     						  pool->attrs->cpumask) < 0);
>>>>> -		kthread_set_per_cpu(worker->task, true);
>>>>>     	}
>>>>>     
>>>>>     	raw_spin_lock_irq(&pool->lock);
>>>>> --
>>>>>
>>>> I tested the patch; the warning still appears in the kernel log, but the
>>>
>>> Thanks.
>>>
>>>> warning is different from before.
>>>>
>>>> [    0.054803] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>>>> [    0.054803] masked ExtINT on CPU#1
>>>> [    0.054803] smpboot: CPU 1 Converting physical 0 to logical die 1
>>>> [    1.890338] ------------[ cut here ]------------
>>>> [    1.890338] WARNING: CPU: 1 PID: 18 at kernel/kthread.c:508 kthread_set_per_cpu+0x156/0x180
>>>> [    1.890338] Modules linked in:
>>>> [    1.890338] CPU: 1 PID: 18 Comm: cpuhp/1 Not tainted 5.11.0-rc3-00186-ged03082352b2 #2
>>>
>>> This warning is better than the one in process_one_work(), which fires
>>> too late; still, it has something to tell us.
>>>
>>>> [    1.890338] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>>>> [    1.890338] EIP: kthread_set_per_cpu+0x156/0x180
>>>> [    1.890338] Code: 00 00 00 00 ff 05 68 4e 04 43 83 c4 08 5b 5e 5f c3
>>>> 8d 76 00 ff 05 34 50 04 43 0f 0b e9 f9 fe ff ff 8d 76 00 ff 05 2c 4e 04
>>>> 43 <0f> 0b eb 9d 8d b6 00 00 00 00 ff 05 40 4e 04 43 0f 0b e9 45 ff ff
>>>> [    1.890338] EAX: 42f52ce0 EBX: 00000001 ECX: 00000000 EDX: 00000001
>>>> [    1.890338] ESI: 43d76300 EDI: 43c0de00 EBP: de7f2564 ESP: 43d6beb8
>>>> [    1.900350] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010202
>>>> [    1.901303] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0
>>>> [    1.902280] Call Trace:
>>>> [    1.902682]  ? workqueue_online_cpu+0x12b/0x640
>>>> [    1.903415]  ? workqueue_prepare_cpu+0xa0/0xa0
>>>
>>> By the prepare function we know it is a warning at boot time.
>>>
>>>> [    1.904155]  ? cpuhp_invoke_callback+0x1ed/0x1340
>>>> [    1.904941]  ? cpuhp_thread_fun+0x28f/0x460
>>>> [    1.905630]  ? cpuhp_thread_fun+0x49/0x460
>>>> [    1.906298]  ? smpboot_thread_fn+0x446/0x620
>>>> [    1.910275]  ? kthread+0x1ba/0x1e0
>>>> [    1.910857]  ? __smpboot_create_thread+0x260/0x260
>>>> [    1.911659]  ? kzalloc+0x20/0x20
>>>> [    1.912368]  ? ret_from_fork+0x1c/0x28
>>>> [    1.913016] ---[ end trace 6f6c005278241eba ]---
>>>> [    1.913971] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
>>>> [    1.920012] smp: Brought up 1 node, 2 CPUs
>>>> [    1.920299] smpboot: Max logical packages: 2
>>>> [    1.921019] smpboot: Total of 2 processors activated (10774.03 BogoMIPS)
>>>
>>> After staring at it again, the change added in rebind_workers() is
>>> incorrect for a CPU getting back online after bootup, because the
>>> CPU affinity has been broken at offline time.
>>>
>>> But the warning in the latest report occurs at boot time, after going
>>> through steps like,
>>>
>>> 	kthread_create_on_node();
>>>
>>> 	kthread_bind_mask(worker->task, pool->attrs->cpumask);
>>>
>>> 	/* successful, attach the worker to the pool */
>>> 	worker_attach_to_pool(worker, pool, 0);
>>>
>>> 	/* start the newly created worker */
>>> 	wake_up_process(worker->task);
>>>
>>>
>>> without triggering the warning at kernel/kthread.c:507.
>>>
>>> Before finding an answer (wake_up_new_task()) to the question of
>>> who changed the allowed ptr, revert the change at online time
>>> and, as compensation, clear POOL_DISASSOCIATED as early as we can.
>>>
>>>
>>> --- a/kernel/workqueue.c
>>> +++ b/kernel/workqueue.c
>>> @@ -1844,25 +1844,23 @@ static struct worker *alloc_worker(int n
>>>     * cpu-[un]hotplugs.
>>>     */
>>>    static void worker_attach_to_pool(struct worker *worker,
>>> -				   struct worker_pool *pool)
>>> +				   struct worker_pool *pool, int set)
>>>    {
>>>    	mutex_lock(&wq_pool_attach_mutex);
>>>    
>>>    	/*
>>> -	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
>>> -	 * online CPUs.  It'll be re-applied when any of the CPUs come up.
>>> -	 */
>>> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>> -
>>> -	/*
>>>    	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>>>    	 * stable across this function.  See the comments above the flag
>>>    	 * definition for details.
>>>    	 */
>>>    	if (pool->flags & POOL_DISASSOCIATED)
>>>    		worker->flags |= WORKER_UNBOUND;
>>> -	else
>>> +	else {
>>>    		kthread_set_per_cpu(worker->task, true);
>>> +		if (set)
>>> +			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
>>> +						pool->attrs->cpumask) < 0);
>>> +	}
>>>    
>>>    	list_add_tail(&worker->node, &pool->workers);
>>>    	worker->pool = pool;
>>> @@ -1943,7 +1941,7 @@ static struct worker *create_worker(stru
>>>    	kthread_bind_mask(worker->task, pool->attrs->cpumask);
>>>    
>>>    	/* successful, attach the worker to the pool */
>>> -	worker_attach_to_pool(worker, pool);
>>> +	worker_attach_to_pool(worker, pool, 0);
>>>    
>>>    	/* start the newly created worker */
>>>    	raw_spin_lock_irq(&pool->lock);
>>> @@ -2508,7 +2506,7 @@ repeat:
>>>    
>>>    		raw_spin_unlock_irq(&wq_mayday_lock);
>>>    
>>> -		worker_attach_to_pool(rescuer, pool);
>>> +		worker_attach_to_pool(rescuer, pool, 1);
>>>    
>>>    		raw_spin_lock_irq(&pool->lock);
>>>    
>>> @@ -4923,7 +4921,6 @@ static void unbind_workers(int cpu)
>>>    
>>>    		for_each_pool_worker(worker, pool) {
>>>    			kthread_set_per_cpu(worker->task, false);
>>> -			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
>>>    		}
>>>    
>>>    		mutex_unlock(&wq_pool_attach_mutex);
>>> @@ -5954,6 +5951,7 @@ void __init workqueue_init_early(void)
>>>    		i = 0;
>>>    		for_each_cpu_worker_pool(pool, cpu) {
>>>    			BUG_ON(init_worker_pool(pool));
>>> +			pool->flags &= ~POOL_DISASSOCIATED;
>>>    			pool->cpu = cpu;
>>>    			cpumask_copy(pool->attrs->cpumask, cpumask_of(cpu));
>>>    			pool->attrs->nice = std_nice[i++];
>>> --
>>>
>> I tested the patch; the warning still appears in the kernel log.
> 
> Thanks.
>>
>> [    0.043584] masked ExtINT on CPU#1
>> [    0.043584] smpboot: CPU 1 Converting physical 0 to logical die 1
>> [    1.681867] ------------[ cut here ]------------
>> [    1.681867] WARNING: CPU: 1 PID: 18 at kernel/workqueue.c:5015 workqueue_online_cpu+0x336/0x640
> 
> This is a new one at boot time.
> 
>> [    1.681867] Modules linked in:
>> [    1.681867] CPU: 1 PID: 18 Comm: cpuhp/1 Not tainted 5.11.0-rc3-00186-g178826956ca7 #2
>> [    1.681867] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>> [    1.681867] EIP: workqueue_online_cpu+0x336/0x640
>> [    1.681867] Code: 01 00 00 00 eb 89 8d b6 00 00 00 00 8b 5c 24 0c 89
>> d8 e8 3d 39 19 01 e9 1c fd ff ff 8d b4 26 00 00 00 00 90 ff 05 dc 20 04
>> 43 <0f> 0b e9 07 ff ff ff 8d 76 00 ff 05 2c 21 04 43 0f 0b e9 12 fe ff
>> [    1.681867] EAX: 42f51168 EBX: 0000000c ECX: 00000000 EDX: 00000001
>> [    1.681867] ESI: 00000000 EDI: 43c04720 EBP: de7f2564 ESP: 43d6bed0
>> [    1.681867] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>> [    1.681867] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0
>> [    1.681867] Call Trace:
>> [    1.681867]  ? workqueue_prepare_cpu+0xa0/0xa0
>> [    1.681867]  ? cpuhp_invoke_callback+0x1ed/0x1340
>> [    1.681867]  ? cpuhp_thread_fun+0x28f/0x460
>> [    1.681867]  ? cpuhp_thread_fun+0x49/0x460
>> [    1.681867]  ? smpboot_thread_fn+0x446/0x620
>> [    1.681867]  ? kthread+0x1ba/0x1e0
>> [    1.681867]  ? __smpboot_create_thread+0x260/0x260
>> [    1.681867]  ? kzalloc+0x20/0x20
>> [    1.681867]  ? ret_from_fork+0x1c/0x28
>> [    1.681867] ---[ end trace 3e853d8ad82e44ac ]---
>> [    1.691939] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
>> [    1.705909] smp: Brought up 1 node, 2 CPUs
>> [    1.706605] smpboot: Max logical packages: 2
>>
>> [   85.402803] smpboot: Booting Node 0 Processor 1 APIC 0x1
>> [   85.442465] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>> [   85.442602] masked ExtINT on CPU#1
>> [   85.541391] ------------[ cut here ]------------
>> [   85.542250] WARNING: CPU: 1 PID: 18 at kernel/kthread.c:508 kthread_set_per_cpu+0x156/0x180
> 
> This is a known one.
> 
>> [   85.543664] Modules linked in: rcutorture torture mousedev evbug
>> input_leds led_class psmouse pcspkr tiny_power_button button
>> [   85.545530] CPU: 1 PID: 18 Comm: cpuhp/1 Tainted: G        W
>> 5.11.0-rc3-00186-g178826956ca7 #2
>> [   85.547006] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS 1.12.0-1 04/01/2014
>> [   85.548454] EIP: kthread_set_per_cpu+0x156/0x180
>> [   85.549231] Code: 00 00 00 00 ff 05 68 4e 04 43 83 c4 08 5b 5e 5f c3
>> 8d 76 00 ff 05 34 50 04 43 0f 0b e9 f9 fe ff ff 8d 76 00 ff 05 2c 4e 04
>> 43 <0f> 0b eb 9d 8d b6 00 00 00 00 ff 05 40 4e 04 43 0f 0b e9 45 ff ff
>> [   85.552236] EAX: 42f52ce0 EBX: 00000001 ECX: 00000000 EDX: 00000001
>> [   85.556663] ESI: bfb70000 EDI: 45e61000 EBP: de7f2564 ESP: 43d6beb8
>> [   85.559168] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010202
>> [   85.561991] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0
>> [   85.563040] Call Trace:
>> [   85.563463]  ? workqueue_online_cpu+0x17f/0x640
>> [   85.564222]  ? rapl_cpu_online+0xb7/0x1a0
>> [   85.564904]  ? workqueue_prepare_cpu+0xa0/0xa0
>> [   85.565643]  ? cpuhp_invoke_callback+0x1ed/0x1340
>> [   85.566424]  ? cpuhp_thread_fun+0x28f/0x460
>> [   85.567140]  ? cpuhp_thread_fun+0x49/0x460
>> [   85.567844]  ? smpboot_thread_fn+0x446/0x620
>> [   85.568569]  ? kthread+0x1ba/0x1e0
>> [   85.569162]  ? __smpboot_create_thread+0x260/0x260
>> [   85.569961]  ? kzalloc+0x20/0x20
>> [   85.570918]  ? ret_from_fork+0x1c/0x28
>> [   85.575418] ---[ end trace 3e853d8ad82e44ae ]---
>> [   85.577919] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
>> [   85.580973] ------------[ cut here ]------------
>> [   85.581747] WARNING: CPU: 0 PID: 87 at kernel/workqueue.c:2190 process_one_work+0x92/0x9e0
>> [   85.583058] Modules linked in: rcutorture torture mousedev evbug
>> input_leds led_class psmouse pcspkr tiny_power_button button
>> [   85.584850] CPU: 0 PID: 87 Comm: kworker/1:2 Tainted: G        W 5.11.0-rc3-00186-g178826956ca7 #2
> 
> This is a big regression.
> 
>> [   85.586456] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS 1.12.0-1 04/01/2014
>> [   85.587753] Workqueue:  0x0 (rcu_gp)
>> [   85.588347] EIP: process_one_work+0x92/0x9e0
>> [   85.589018] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00
>> 00 00 c7 04 24 01 00 00 00 b8 d8 1c f5 42 e8 14 85 13 00 ff 05 a4 30 04
>> 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>> [   85.595358] EAX: 42f51cd8 EBX: df4080f8 ECX: 00000000 EDX: 00000001
>> [   85.596319] ESI: 43d94840 EDI: df4080f4 EBP: de7f23c0 ESP: 45505f08
>> [   85.597291] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>> [   85.598410] CR0: 80050033 CR2: 372c2004 CR3: 7fa9f000 CR4: 000406d0
>> [   85.599401] Call Trace:
>> [   85.599804]  ? process_one_work+0x20e/0x9e0
>> [   85.600474]  ? worker_thread+0x40d/0x6a0
>> [   85.601096]  ? kthread+0x1ba/0x1e0
>> [   85.601635]  ? create_worker+0x1e0/0x1e0
>> [   85.602246]  ? kzalloc+0x20/0x20
>> [   85.602907]  ? ret_from_fork+0x1c/0x28
>> [   85.603512] ---[ end trace 3e853d8ad82e44af ]---
>> [   85.699976] smpboot: CPU 1 is now offline
>> [   85.771891] x86: Booting SMP configuration:
>> [   85.772532] smpboot: Booting Node 0 Processor 1 APIC 0x1
> 
> 1/ to fix the regression, roll back to the version without the warning
> in process_one_work().
> 
> 2/ to see whether that version is anything more than a blind cat
> stumbling on a dead rat, silence the 508 warning for kworkers for now.
> 
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1844,25 +1844,23 @@ static struct worker *alloc_worker(int n
>    * cpu-[un]hotplugs.
>    */
>   static void worker_attach_to_pool(struct worker *worker,
> -				   struct worker_pool *pool)
> +				   struct worker_pool *pool, int set)
>   {
>   	mutex_lock(&wq_pool_attach_mutex);
>   
>   	/*
> -	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
> -	 * online CPUs.  It'll be re-applied when any of the CPUs come up.
> -	 */
> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> -
> -	/*
>   	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>   	 * stable across this function.  See the comments above the flag
>   	 * definition for details.
>   	 */
>   	if (pool->flags & POOL_DISASSOCIATED)
>   		worker->flags |= WORKER_UNBOUND;
> -	else
> +	else {
>   		kthread_set_per_cpu(worker->task, true);
> +		if (set)
> +			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
> +						pool->attrs->cpumask) < 0);
> +	}
>   
>   	list_add_tail(&worker->node, &pool->workers);
>   	worker->pool = pool;
> @@ -1943,7 +1941,7 @@ static struct worker *create_worker(stru
>   	kthread_bind_mask(worker->task, pool->attrs->cpumask);
>   
>   	/* successful, attach the worker to the pool */
> -	worker_attach_to_pool(worker, pool);
> +	worker_attach_to_pool(worker, pool, 0);
>   
>   	/* start the newly created worker */
>   	raw_spin_lock_irq(&pool->lock);
> @@ -2508,7 +2506,7 @@ repeat:
>   
>   		raw_spin_unlock_irq(&wq_mayday_lock);
>   
> -		worker_attach_to_pool(rescuer, pool);
> +		worker_attach_to_pool(rescuer, pool, 1);
>   
>   		raw_spin_lock_irq(&pool->lock);
>   
> @@ -4923,7 +4921,6 @@ static void unbind_workers(int cpu)
>   
>   		for_each_pool_worker(worker, pool) {
>   			kthread_set_per_cpu(worker->task, false);
> -			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
>   		}
>   
>   		mutex_unlock(&wq_pool_attach_mutex);
> @@ -4977,9 +4974,9 @@ static void rebind_workers(struct worker
>   	 * from CPU_ONLINE, the following shouldn't fail.
>   	 */
>   	for_each_pool_worker(worker, pool) {
> +		kthread_set_per_cpu(worker->task, true);
>   		WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
>   						  pool->attrs->cpumask) < 0);
> -		kthread_set_per_cpu(worker->task, true);
>   	}
>   
>   	raw_spin_lock_irq(&pool->lock);
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -505,7 +505,7 @@ void kthread_set_per_cpu(struct task_str
>   
>   	if (set) {
>   		WARN_ON_ONCE(!(k->flags & PF_NO_SETAFFINITY));
> -		WARN_ON_ONCE(k->nr_cpus_allowed != 1);
> +		WARN_ON_ONCE(k->nr_cpus_allowed != 1 && !(k->flags & PF_WQ_WORKER));
>   		set_bit(KTHREAD_IS_PER_CPU, &kthread->flags);
>   	} else {
>   		clear_bit(KTHREAD_IS_PER_CPU, &kthread->flags);
> --
> 
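For reference, the effect of the kernel/kthread.c hunk above can be
modeled in user space roughly as follows. This is only a sketch: struct
task_model and the two per_cpu_warns_*() helpers are made-up names for
illustration, and the PF_* values are assumed to match
include/linux/sched.h around v5.11.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to match include/linux/sched.h around v5.11. */
#define PF_WQ_WORKER      0x00000020u  /* task is a workqueue worker */
#define PF_NO_SETAFFINITY 0x04000000u  /* affinity may not be changed from userland */

/* Hypothetical stand-in for the two task_struct fields the check reads. */
struct task_model {
	unsigned int flags;
	int nr_cpus_allowed;
};

/* Original check: any task allowed on more than one CPU would warn. */
bool per_cpu_warns_orig(const struct task_model *k)
{
	assert(k != NULL);
	return k->nr_cpus_allowed != 1;
}

/* Relaxed check from the hunk: workqueue workers are exempted. */
bool per_cpu_warns_relaxed(const struct task_model *k)
{
	assert(k != NULL);
	return k->nr_cpus_allowed != 1 && !(k->flags & PF_WQ_WORKER);
}
```

With this model, a kworker that is still allowed on two CPUs warns under
the original check but not under the relaxed one, so the hunk hides the
508 warning for kworkers without changing their affinity.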

I tested the patch 4 times; no warning appeared in the kernel log.
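
For context, the condition behind the process_one_work() warning seen in
the earlier reports can be modeled in user space roughly as below. This
is a sketch under assumptions: struct pool_model and wrong_cpu_warns()
are hypothetical names, and the POOL_DISASSOCIATED value is assumed to
match kernel/workqueue.c around v5.11.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to match the pool flag in kernel/workqueue.c around v5.11. */
#define POOL_DISASSOCIATED (1u << 2)

/* Hypothetical stand-in for the worker_pool fields the check reads. */
struct pool_model {
	unsigned int flags;
	int cpu;	/* CPU the per-CPU pool is bound to */
};

/*
 * Model of the "ensure we're on the correct CPU" check: a worker of an
 * associated pool must run on the pool's own CPU.
 */
bool wrong_cpu_warns(const struct pool_model *pool, int running_cpu)
{
	assert(pool != NULL);
	return !(pool->flags & POOL_DISASSOCIATED) && running_cpu != pool->cpu;
}
```

In the reports above, kworker/1:N (pool->cpu == 1) was running on CPU 0
while the pool was not marked POOL_DISASSOCIATED, which is the case this
model flags.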

-- 
Zhengjun Xing

Download attachment "dmesg.xz" of type "application/octet-stream" (38184 bytes)
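
To check a decompressed dmesg such as the attachment for the warnings
discussed in this thread, a small helper along these lines could pick out
the WARNING header lines. It is a sketch only: struct warn_site and
parse_warning_line() are hypothetical names, and the line format is
inferred from the log excerpts quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields carried by a "WARNING: CPU: N PID: M at file:line func" line. */
struct warn_site {
	int cpu;
	int pid;
	char file[128];
	int line;
	char func[64];	/* empty if the line carries no function name */
};

/* Return 1 and fill *out if the line holds a WARNING header, else 0. */
int parse_warning_line(const char *line, struct warn_site *out)
{
	const char *p;
	int n;

	assert(line != NULL && out != NULL);

	p = strstr(line, "WARNING: CPU: ");
	if (p == NULL)
		return 0;

	out->func[0] = '\0';
	/* The function name stops at '+' so the +0x.../0x... offset is dropped. */
	n = sscanf(p, "WARNING: CPU: %d PID: %d at %127[^:]:%d %63[^+ \n]",
		   &out->cpu, &out->pid, out->file, &out->line, out->func);
	return n >= 4;
}
```

Feeding each line of the log through parse_warning_line() and counting
the hits per file:line makes it easy to tell the workqueue.c warnings
from the kthread.c one across test runs.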