lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 7 Jul 2020 11:47:24 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     "Zhang, Qiang" <Qiang.Zhang@...driver.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: 回复: [kthread] a90477f0c9:
 WARNING:at_kernel/kthread.c:#kthread_queue_work

On Mon 2020-07-06 10:17:31, Zhang, Qiang wrote:
> Hi , Petr Mladek
> There some question for "Work could not be queued when worker being destroyed" patch,
> 
> when in "spi_init_queue" func :
> "kthread_init_worker(&ctlr->kworker);     (worker->task = NULL)
>  ctlr->kworker_task = kthread_run(kthread_worker_fn, &ctlr->kworker,
>                                          "%s", dev_name(&ctlr->dev)); "

I see. I have missed that there are some kthread_worker users that
start the worker this way. They rely on the fact that
worker->task is set also by kthread_worker_fn.

The proper solution is to start the worker using either
kthread_create_worker() or kthread_create_worker_on_cpu().
They set worker->task immediately.

It means that more work is needed:

1. Convert all users that start the kthread_worker via
   kthread_worker_fn to use either kthread_create_worker()
   or kthread_create_worker_on_cpu().

2. Remove kthread_worker_fn declaration from include/linux/kthread.h
   to prevent starting the worker the temporary way.

   In the same patch, also the assignment to worker->task and
   the FIXME might get removed from kthread_worker_fn().


3. Finally, it should be safe to add the WARN_ON() into
   queuing_blocked().

Best Regards,
Petr

> 
>  in "spi_start_queue" func:
>  "kthread_queue_work(&ctlr->kworker, &ctlr->pump_messages);"
> 
>  Becasue the kthread_worker_fn is not begin running,  if queue work to worker,
>  the "!worker->task" = true, trigger WARN.
> 
>  Are Need to add judgment " test_bit(KTHREAD_SHOULD_STOP,  &to_kthread(current)->flags) && WARN_ON(!worker->task)" in queuing_blocked func ?
> 
>  Zhang Qiang
> 
>  
> 
> 
> ________________________________________
> 发件人: kernel test robot <rong.a.chen@...el.com>
> 发送时间: 2020年7月6日 17:38
> 收件人: Zhang, Qiang
> 抄送: lkp@...ts.01.org
> 主题: [kthread] a90477f0c9: WARNING:at_kernel/kthread.c:#kthread_queue_work
> 
> Greeting,
> 
> FYI, we noticed the following commit (built with gcc-7):
> 
> commit: a90477f0c956621eb0dd69f0abfb6066ad8fbef7 ("kthread: work could not be queued when worker being destroyed")
> https://github.com/hnaz/linux-mm master
> 
> in testcase: trinity
> with following parameters:
> 
>         runtime: 300s
> 
> test-description: Trinity is a linux system call fuzz tester.
> test-url: http://codemonkey.org.uk/projects/trinity/
> 
> 
> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> 
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> 
> 
> +-------------------------------------------------+------------+------------+
> |                                                 | 85c6127e06 | a90477f0c9 |
> +-------------------------------------------------+------------+------------+
> | boot_successes                                  | 6          | 0          |
> | boot_failures                                   | 0          | 30         |
> | WARNING:at_kernel/kthread.c:#kthread_queue_work | 0          | 30         |
> | EIP:kthread_queue_work                          | 0          | 30         |
> | BUG:kernel_hang_in_test_stage                   | 0          | 2          |
> +-------------------------------------------------+------------+------------+
> 
> 
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <rong.a.chen@...el.com>
> 
> 
> [    5.554282] WARNING: CPU: 0 PID: 1 at kernel/kthread.c:817 kthread_queue_work+0xf8/0x120
> [    5.556204] Modules linked in:
> [    5.556204] CPU: 0 PID: 1 Comm: swapper Tainted: G S                5.8.0-rc3-00014-ga90477f0c9566 #1
> [    5.556204] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> [    5.556204] EIP: kthread_queue_work+0xf8/0x120
> [    5.556204] Code: 00 59 e9 67 ff ff ff 8d 76 00 8b 4e 10 85 c9 75 a6 8d 4b 28 89 f2 89 d8 bf 01 00 00 00 e8 f0 f5 ff ff eb 93 8d b6 00 00 00 00 <0f> 0b 6a 00 31 c9 ba 01 00 00 00 b8 08 6c 64 c3 e8 f3 04 0b 00 5b
> [    5.556204] EAX: 00000000 EBX: eeff538c ECX: 00000000 EDX: 00000001
> [    5.556204] ESI: eeff53d0 EDI: 00000000 EBP: f5edde70 ESP: f5edde5c
> [    5.556204] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00010046
> [    5.556204] CR0: 80050033 CR2: b7eda844 CR3: 038ae000 CR4: 000406d0
> [    5.556204] Call Trace:
> [    5.556204]  spi_start_queue+0x50/0x70
> [    5.556204]  spi_register_controller+0x642/0xa80
> [    5.556204]  ? kobject_get+0x54/0xf0
> [    5.556204]  ? parport_pc_platform_probe+0x10/0x10
> [    5.556204]  spi_bitbang_start+0x2f/0x70
> [    5.556204]  ? parport_pc_platform_probe+0x10/0x10
> [    5.556204]  butterfly_attach+0x164/0x2c0
> [    5.556204]  ? driver_detach+0x30/0x30
> [    5.556204]  port_check+0x1c/0x30
> [    5.556204]  bus_for_each_dev+0x5a/0x90
> [    5.556204]  __parport_register_driver+0x76/0xa0
> [    5.556204]  ? driver_detach+0x30/0x30
> [    5.556204]  ? spi_engine_driver_init+0x16/0x16
> [    5.556204]  butterfly_init+0x19/0x1b
> [    5.556204]  do_one_initcall+0x79/0x310
> [    5.556204]  ? parse_args+0x70/0x420
> [    5.556204]  ? rcu_read_lock_sched_held+0x2f/0x50
> [    5.556204]  ? trace_initcall_level+0x95/0xc7
> [    5.556204]  ? kernel_init_freeable+0x129/0x19f
> [    5.556204]  kernel_init_freeable+0x148/0x19f
> [    5.556204]  ? rest_init+0x100/0x100
> [    5.556204]  kernel_init+0xd/0xf0
> [    5.556204]  ret_from_fork+0x1c/0x28
> [    5.556204] irq event stamp: 8620410
> [    5.556204] hardirqs last  enabled at (8620409): [<c2675daa>] _raw_spin_unlock_irqrestore+0x2a/0x50
> [    5.556204] hardirqs last disabled at (8620410): [<c2675bb4>] _raw_spin_lock_irqsave+0x14/0x40
> [    5.556204] softirqs last  enabled at (8620262): [<c2677900>] __do_softirq+0x2e0/0x480
> [    5.556204] softirqs last disabled at (8620249): [<c1023cf5>] call_on_stack+0x45/0x50
> [    5.556204] ---[ end trace 184061a442b5fd6f ]---
> 
> 
> To reproduce:
> 
>         # build kernel
>         cd linux
>         cp config-5.8.0-rc3-00014-ga90477f0c9566 .config
>         make HOSTCC=gcc-7 CC=gcc-7 ARCH=i386 olddefconfig prepare modules_prepare bzImage
> 
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
> 
> 
> 
> Thanks,
> Rong Chen

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ