Date:	Tue, 3 Mar 2015 11:00:43 +0100
From:	Tomeu Vizoso <tomeu.vizoso@...il.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Jesper Nilsson <jesper.nilsson@...s.com>,
	Rabin Vincent <rabinv@...s.com>,
	Jesper Nilsson <jespern@...s.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH for-3.20-fixes] workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE

On 2 March 2015 at 17:21, Tejun Heo <tj@...nel.org> wrote:
> On Mon, Mar 02, 2015 at 01:26:15PM +0100, Jesper Nilsson wrote:
>> On Mon, Feb 09, 2015 at 05:15:27PM +0100, Tejun Heo wrote:
>> > Hello,
>>
>> Hi!
>>
>> > This patch removes the possible hang by updating __cancel_work_timer()
>> > to explicitly wait for clearing of CANCELING rather than invoking
>> > flush_work() after try_to_grab_pending() fails with -ENOENT.  The
>> > explicit wait uses the matching bit waitqueue for the CANCELING bit.
>> >
>> > Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com
>> >
>> > Signed-off-by: Tejun Heo <tj@...nel.org>
>> > Reported-by: Rabin Vincent <rabin.vincent@...s.com>
>> > Cc: stable@...r.kernel.org
>>
>> What's the status on this patch, it's not in 4.0-rc1 at least?
>> Is it queued for the 3.18 stable branch?
>
> Sorry about the delay.  Applied to wq/for-4.0-fixes.  Will push out in
> a week or so.
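
The wait-and-retry scheme described in the quoted changelog can be pictured
with a small userspace analogy. This is only a sketch under stated
assumptions, not the kernel code: WorkItem, try_grab(),
wait_until_not_canceling(), clear_canceling() and cancel_sync() are
hypothetical names, a threading.Condition stands in for the bit waitqueue,
and a plain boolean stands in for the CANCELING bit.

```python
import threading


class WorkItem:
    """Hypothetical userspace stand-in for a work item with a CANCELING bit."""

    def __init__(self):
        # Assumption: the condition variable plays the role of the bit waitqueue.
        self._cond = threading.Condition()
        # Assumption: this flag plays the role of the CANCELING bit.
        self.canceling = False

    def try_grab(self):
        """Claim the item; return False if another canceller already owns it."""
        with self._cond:
            if self.canceling:
                return False
            self.canceling = True
            return True

    def wait_until_not_canceling(self):
        """Sleep until the current owner clears the flag (the explicit wait)."""
        with self._cond:
            self._cond.wait_for(lambda: not self.canceling)

    def clear_canceling(self):
        """Clear the flag and wake every waiter."""
        with self._cond:
            self.canceling = False
            self._cond.notify_all()


def cancel_sync(work, on_owned):
    """Loop: try to grab; if someone else is cancelling, wait, then retry."""
    while not work.try_grab():
        work.wait_until_not_canceling()
    try:
        on_owned()  # whatever the owner does while it holds the item
    finally:
        work.clear_canceling()
```

The losing caller sleeps until the owner clears the flag instead of spinning
on the grab attempt, which is the property the changelog is after.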

Hello,

I'm getting the oops below on almost every boot this morning, after
rebasing on today's linux-next. Reverting this patch makes the issue go
away. This was tested on a Tegra 124-based Acer Chromebook 13 running
a Debian derivative (I mention this because I see that in some test
farms the boot succeeded on similar hardware, but they probably have a
simpler userspace, e.g. !systemd).

[    7.358239] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[    7.368225] pgd = c0204000
[    7.372693] [00000000] *pgd=00000000
[    7.378031] Internal error: Oops: 17 [#1] SMP ARM
[    7.384486] Modules linked in: ipv6
[    7.389738] CPU: 1 PID: 110 Comm: kworker/1:2 Not tainted 4.0.0-rc1-next-20150303ccu #568
[    7.399687] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
[    7.407736] Workqueue: cgroup_destroy css_free_work_fn
[    7.414645] task: ecfe8e40 ti: eb9e0000 task.ti: eb9e0000
[    7.421803] PC is at wake_bit_function+0x18/0x6c
[    7.428168] LR is at __wake_up_common+0x5c/0x90
[    7.434433] pc : [<c028f54c>]    lr : [<c028ed3c>]    psr: 200f0093
[    7.434433] sp : eb9e1df0  ip : eb9e1e08  fp : eb9e1e04
[    7.449379] r10: 00000001  r9 : 00000003  r8 : 00000000
[    7.456331] r7 : 00000000  r6 : ee837a28  r5 : 00000001  r4 : eeda8ee0
[    7.464580] r3 : 00000000  r2 : 00000000  r1 : 00000003  r0 : ebb19df4
[    7.472825] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[    7.481952] Control: 10c5387d  Table: abb9406a  DAC: 00000015
[    7.489427] Process kworker/1:2 (pid: 110, stack limit = 0xeb9e0220)
[    7.497524] Stack: (0xeb9e1df0 to 0xeb9e2000)
[    7.503616] 1de0:                                     ee837a1c 00000001 eb9e1e34 eb9e1e08
[    7.513549] 1e00: c028ed3c c028f540 00000000 ee837a24 800f0013 00000000 00000001 00000003
[    7.523490] 1e20: 00000000 00000000 eb9e1e64 eb9e1e38 c028ef88 c028ecec 00000000 c026d274
[    7.533438] 1e40: eb9e1e74 00000011 eb932fb4 00000000 ee837a24 c028f4f0 eb9e1eac eb9e1e68
[    7.543390] 1e60: c026fdfc c028ef4c 600f0013 eb9e1e88 eb9e1e88 eb9e1e74 eb9e1e74 00000006
[    7.553357] 1e80: 00000000 ebaf2a80 eb932f88 eb932f00 eb932f90 c11f5638 00000000 ee7f7005
[    7.563329] 1ea0: eb9e1ebc eb9e1eb0 c026fedc c026fd20 eb9e1ee4 eb9e1ec0 c02ce3bc c026fecc
[    7.573310] 1ec0: eb932f50 ecf81080 c11b0338 ee7f33c0 ee7f7000 00000000 eb9e1f24 eb9e1ee8
[    7.583296] 1ee0: c026f4d0 c02ce254 ee7f33c0 ee7f33d4 eb9e0000 00000000 ecf81080 ee7f33c0
[    7.593289] 1f00: ecf81098 ee7f33d4 eb9e0000 00000008 ecf81080 ee7f33c0 eb9e1f5c eb9e1f28
[    7.603295] 1f20: c026ff6c c026f380 c026ff18 c10ae100 00000000 00000000 eb9c0180 ecf81080
[    7.613305] 1f40: c026ff18 00000000 00000000 00000000 eb9e1fac eb9e1f60 c02749d8 c026ff24
[    7.623321] 1f60: 00000000 00000000 00000000 ecf81080 00000000 00000000 eb9e1f78 eb9e1f78
[    7.633337] 1f80: 00000000 00000000 eb9e1f88 eb9e1f88 eb9c0180 c02748ec 00000000 00000000
[    7.643330] 1fa0: 00000000 eb9e1fb0 c0210aa0 c02748f8 00000000 00000000 00000000 00000000
[    7.653299] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    7.663248] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[    7.673183] [<c028f54c>] (wake_bit_function) from [<c028ed3c>] (__wake_up_common+0x5c/0x90)
[    7.683300] [<c028ed3c>] (__wake_up_common) from [<c028ef88>] (__wake_up+0x48/0x5c)
[    7.692744] [<c028ef88>] (__wake_up) from [<c026fdfc>] (__cancel_work_timer+0xe8/0x1ac)
[    7.702533] [<c026fdfc>] (__cancel_work_timer) from [<c026fedc>] (cancel_work_sync+0x1c/0x20)
[    7.712854] [<c026fedc>] (cancel_work_sync) from [<c02ce3bc>] (css_free_work_fn+0x174/0x2ec)
[    7.723099] [<c02ce3bc>] (css_free_work_fn) from [<c026f4d0>] (process_one_work+0x15c/0x3dc)
[    7.733339] [<c026f4d0>] (process_one_work) from [<c026ff6c>] (worker_thread+0x54/0x4e8)
[    7.743224] [<c026ff6c>] (worker_thread) from [<c02749d8>] (kthread+0xec/0x104)
[    7.752339] [<c02749d8>] (kthread) from [<c0210aa0>] (ret_from_fork+0x14/0x34)
[    7.761366] Code: e24cb004 e52de004 e8bd4000 e510400c (e5935000)
[    7.769273] ---[ end trace f25fc65c3d66034c ]---
[    7.778675] Unable to handle kernel paging request at virtual address ffffffec
[    7.787735] pgd = c0204000
[    7.792274] [ffffffec] *pgd=af7fd821, *pte=00000000, *ppte=00000000
[    7.800424] Internal error: Oops: 17 [#2] SMP ARM
[    7.806970] Modules linked in: ipv6
[    7.812307] CPU: 1 PID: 110 Comm: kworker/1:2 Tainted: G      D     4.0.0-rc1-next-20150303ccu #568
[    7.823549] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
[    7.831683] task: ecfe8e40 ti: eb9e0000 task.ti: eb9e0000
[    7.838946] PC is at kthread_data+0x18/0x20
[    7.844997] LR is at wq_worker_sleeping+0x1c/0xe0
[    7.851562] pc : [<c02750a4>]    lr : [<c02704ac>]    psr: 00070093
[    7.851562] sp : eb9e1b38  ip : eb9e1b48  fp : eb9e1b44
[    7.866774] r10: 2d74a000  r9 : ecfe90d4  r8 : c10aedd4
[    7.873862] r7 : c10a9840  r6 : c10a9840  r5 : ecfe8e40  r4 : 00000001
[    7.882253] r3 : 00000000  r2 : 00000000  r1 : 00000001  r0 : ecfe8e40
[    7.890642] Flags: nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
[    7.899735] Control: 10c5387d  Table: ab82806a  DAC: 00000015
[    7.907359] Process kworker/1:2 (pid: 110, stack limit = 0xeb9e0220)
[    7.915604] Stack: (0xeb9e1b38 to 0xeb9e2000)
[    7.921836] 1b20:                                                       eb9e1b5c eb9e1b48
[    7.931894] 1b40: c02704ac c0275098 ee7f3840 ecfe8e40 eb9e1ba4 eb9e1b60 c0ad3a64 c027049c
[    7.941971] 1b60: eb9e1bbc eb9e1b70 c025a2cc c02a81ac 00000000 00000001 ecfe6e08 eb9e0000
[    7.952061] 1b80: eb9e199c eb9e1bc8 ecfe9050 00000001 c028f550 ec920000 eb9e1bbc eb9e1ba8
[    7.962152] 1ba0: c0ad3cfc c0ad3700 0420806c ecfe8e40 eb9e1bfc eb9e1bc0 c025a9b8 c0ad3cbc
[    7.972247] 1bc0: eb9e1bec c10f0ffc eb9e1bc8 eb9e1bc8 eb9e1da8 c11ca184 c10b32e4 eb9e1da8
[    7.982331] 1be0: 600f0193 0000000b c028f550 00000001 eb9e1c84 eb9e1c00 c0215058 c025a368
[    7.992414] 1c00: eb9e0220 0000000b c0d7bc6c c0d7bc64 00000008 00000000 00000000 c10b32e4
[    8.002501] 1c20: 6529b270 62633432 20343030 64323565 34303065 62386520 30303464 35652030
[    8.012600] 1c40: 30343031 28206330 33393565 30303035 c0002029 c0ad17e4 c0e54bcc 00000000
[    8.022715] 1c60: 00000017 eb9e1da8 00000000 00000000 00000003 eb9e1da8 eb9e1c9c eb9e1c88
[    8.032832] 1c80: c0ad0df4 c0214c08 eb9e1da8 ecfe8e40 eb9e1cf4 eb9e1ca0 c0221408 c0ad0d8c
[    8.042955] 1ca0: ecfe8e40 c10aedd4 ec9ab440 ecfe8e88 ee7f3840 c0285c50 ee7f3880 ec9ab490
[    8.053090] 1cc0: eb9e1ce4 eb9e1cd0 c0284f88 c10b3864 00000017 c0221190 00000000 eb9e1da8
[    8.063238] 1ce0: 00000003 00000001 eb9e1da4 eb9e1cf8 c02091e8 c022119c c10a9840 c0275894
[    8.073402] 1d00: ecb5801c c0275894 eb9e1d3c eb9e1d18 c0275894 c0219c3c ecb5801c ecfe8e40
[    8.083560] 1d20: ec9ab440 00000000 eb81ca80 c0ad3904 ee7f3840 ecfe8e40 ec9ab440 00000000
[    8.093711] 1d40: eb81ca80 ecfe90d0 eb9e1d9c eb9e1d58 c0ad3904 c027b70c 00000000 000003ff
[    8.103866] 1d60: 00000000 00000001 2d74a000 ee7f3840 00000000 eb9e0000 eb9e0000 eb9e1e84
[    8.114026] 1d80: 00000002 c028f54c 200f0093 ffffffff eb9e1ddc 00000000 eb9e1e04 eb9e1da8
[    8.124175] 1da0: c02157d8 c02091ac ebb19df4 00000003 00000000 00000000 eeda8ee0 00000001
[    8.134322] 1dc0: ee837a28 00000000 00000000 00000003 00000001 eb9e1e04 eb9e1e08 eb9e1df0
[    8.144464] 1de0: c028ed3c c028f54c 200f0093 ffffffff ee837a1c 00000001 eb9e1e34 eb9e1e08
[    8.154608] 1e00: c028ed3c c028f540 00000000 ee837a24 800f0013 00000000 00000001 00000003
[    8.164744] 1e20: 00000000 00000000 eb9e1e64 eb9e1e38 c028ef88 c028ecec 00000000 c026d274
[    8.174879] 1e40: eb9e1e74 00000011 eb932fb4 00000000 ee837a24 c028f4f0 eb9e1eac eb9e1e68
[    8.185012] 1e60: c026fdfc c028ef4c 600f0013 eb9e1e88 eb9e1e88 eb9e1e74 eb9e1e74 00000006
[    8.195145] 1e80: 00000000 ebaf2a80 eb932f88 eb932f00 eb932f90 c11f5638 00000000 ee7f7005
[    8.205283] 1ea0: eb9e1ebc eb9e1eb0 c026fedc c026fd20 eb9e1ee4 eb9e1ec0 c02ce3bc c026fecc
[    8.215423] 1ec0: eb932f50 ecf81080 c11b0338 ee7f33c0 ee7f7000 00000000 eb9e1f24 eb9e1ee8
[    8.225563] 1ee0: c026f4d0 c02ce254 ee7f33c0 ee7f33d4 eb9e0000 00000000 ecf81080 ee7f33c0
[    8.235696] 1f00: ecf81098 ee7f33d4 eb9e0000 00000008 ecf81080 ee7f33c0 eb9e1f5c eb9e1f28
[    8.245837] 1f20: c026ff6c c026f380 c026ff18 c10ae100 00000000 00000000 eb9c0180 ecf81080
[    8.255986] 1f40: c026ff18 00000000 00000000 00000000 eb9e1fac eb9e1f60 c02749d8 c026ff24
[    8.266143] 1f60: 00000000 00000000 00000000 ecf81080 00000000 00000000 eb9e1f78 eb9e1f78
[    8.276310] 1f80: 00000001 00010001 eb9e1f88 eb9e1f88 eb9c0180 c02748ec 00000000 00000000
[    8.286498] 1fa0: 00000000 eb9e1fb0 c0210aa0 c02748f8 00000000 00000000 00000000 00000000
[    8.296672] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    8.306828] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[    8.316967] [<c02750a4>] (kthread_data) from [<c02704ac>] (wq_worker_sleeping+0x1c/0xe0)
[    8.327023] [<c02704ac>] (wq_worker_sleeping) from [<c0ad3a64>] (__schedule+0x370/0x5bc)
[    8.337072] [<c0ad3a64>] (__schedule) from [<c0ad3cfc>] (schedule+0x4c/0xa4)
[    8.346085] [<c0ad3cfc>] (schedule) from [<c025a9b8>] (do_exit+0x65c/0x960)
[    8.355013] [<c025a9b8>] (do_exit) from [<c0215058>] (die+0x45c/0x474)
[    8.363498] [<c0215058>] (die) from [<c0ad0df4>] (__do_kernel_fault.part.10+0x74/0x84)
[    8.373386] [<c0ad0df4>] (__do_kernel_fault.part.10) from [<c0221408>] (do_page_fault+0x278/0x388)
[    8.384322] [<c0221408>] (do_page_fault) from [<c02091e8>] (do_DataAbort+0x48/0xa8)
[    8.393960] [<c02091e8>] (do_DataAbort) from [<c02157d8>] (__dabt_svc+0x38/0x60)
[    8.403332] Exception stack(0xeb9e1da8 to 0xeb9e1df0)
[    8.410356] 1da0:                   ebb19df4 00000003 00000000 00000000 eeda8ee0 00000001
[    8.420519] 1dc0: ee837a28 00000000 00000000 00000003 00000001 eb9e1e04 eb9e1e08 eb9e1df0
[    8.430686] 1de0: c028ed3c c028f54c 200f0093 ffffffff
[    8.437733] [<c02157d8>] (__dabt_svc) from [<c028f54c>] (wake_bit_function+0x18/0x6c)
[    8.447569] [<c028f54c>] (wake_bit_function) from [<c028ed3c>] (__wake_up_common+0x5c/0x90)
[    8.457917] [<c028ed3c>] (__wake_up_common) from [<c028ef88>] (__wake_up+0x48/0x5c)
[    8.467579] [<c028ef88>] (__wake_up) from [<c026fdfc>] (__cancel_work_timer+0xe8/0x1ac)
[    8.477591] [<c026fdfc>] (__cancel_work_timer) from [<c026fedc>] (cancel_work_sync+0x1c/0x20)
[    8.488145] [<c026fedc>] (cancel_work_sync) from [<c02ce3bc>] (css_free_work_fn+0x174/0x2ec)
[    8.498620] [<c02ce3bc>] (css_free_work_fn) from [<c026f4d0>] (process_one_work+0x15c/0x3dc)
[    8.509093] [<c026f4d0>] (process_one_work) from [<c026ff6c>] (worker_thread+0x54/0x4e8)
[    8.519227] [<c026ff6c>] (worker_thread) from [<c02749d8>] (kthread+0xec/0x104)
[    8.528606] [<c02749d8>] (kthread) from [<c0210aa0>] (ret_from_fork+0x14/0x34)
[    8.537899] Code: e24cb004 e52de004 e8bd4000 e5903268 (e5130014)
[    8.546045] ---[ end trace f25fc65c3d66034d ]---
[    8.552705] Fixing recursive fault but reboot is needed!
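
As a reading aid, the two faulting addresses can be cross-checked against
the bracketed instruction word on each "Code:" line together with the r3
value from the matching register dump. The sketch below is a minimal
decoder for the ARM (A32) LDR/STR immediate encoding only. decode_ldr_imm()
and fault_address() are hypothetical helper names, and the PC-relative
adjustment for Rn = r15 is deliberately not handled.

```python
def decode_ldr_imm(insn):
    """Decode the fields of an A32 LDR/STR (immediate offset) instruction word."""
    if (insn >> 25) & 0b111 != 0b010:
        raise ValueError("not an LDR/STR immediate encoding")
    return {
        "pre_index": bool((insn >> 24) & 1),  # P bit: offset applied before access
        "add": bool((insn >> 23) & 1),        # U bit: add (1) or subtract (0) offset
        "load": bool((insn >> 20) & 1),       # L bit: load (1) or store (0)
        "rn": (insn >> 16) & 0xF,             # base register
        "rt": (insn >> 12) & 0xF,             # destination/source register
        "imm12": insn & 0xFFF,                # unsigned 12-bit offset
    }


def fault_address(insn, regs):
    """Return the 32-bit address the instruction accesses, given {reg: value}."""
    fields = decode_ldr_imm(insn)
    base = regs[fields["rn"]]
    if not fields["pre_index"]:
        return base & 0xFFFFFFFF  # post-indexed: the access uses the base as-is
    offset = fields["imm12"] if fields["add"] else -fields["imm12"]
    return (base + offset) & 0xFFFFFFFF


# First oops: Code: ... (e5935000), r3 : 00000000  ->  ldr r5, [r3]
first = fault_address(0xE5935000, {3: 0x00000000})
# Second oops: Code: ... (e5130014), r3 : 00000000  ->  ldr r0, [r3, #-20]
second = fault_address(0xE5130014, {3: 0x00000000})
print(f"{first:08x} {second:08x}")  # prints "00000000 ffffffec"
```

Both results match the reported addresses (00000000 and ffffffec), which is
consistent with r3 being NULL at each faulting load.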

Regards,

Tomeu
