lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170227183925.GA8707@htj.duckdns.org>
Date:   Mon, 27 Feb 2017 13:39:25 -0500
From:   Tejun Heo <tj@...nel.org>
To:     Dave Jones <davej@...emonkey.org.uk>
Cc:     Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: __queue_work oops.

Hello,

On Mon, Feb 27, 2017 at 12:14:39PM -0500, Dave Jones wrote:
> Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.10.0-think+ #9 
> task: ffff88017f105440 task.stack: ffffc90000094000
> RIP: 0010:__queue_work+0x2d/0x700
> RSP: 0018:ffff880507c03df8 EFLAGS: 00010046
> RAX: 0000000000000082 RBX: 0000000000000101 RCX: 0000000000000002
> RDX: ffff88047bf07c98 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffff880507c03e30 R08: 0000000000000001 R09: ffffffff8294bf68
> R10: ffff880507c03e58 R11: 0000000000000000 R12: ffff88047bf07ce8
> R13: 0000000000000000 R14: 0000000000000000 R15: ffff88047bf07c98
> FS:  0000000000000000(0000) GS:ffff880507c00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000001c2 CR3: 0000000004e11000 CR4: 00000000001406e0
> Call Trace:
>  <IRQ>
>  ? work_on_cpu+0xb0/0xb0
>  delayed_work_timer_fn+0x1e/0x20
>  call_timer_fn+0xbd/0x480
...
> Code starting with the faulting instruction
> ===========================================
>    0:	41 f6 85 c2 01 00 00 	testb  $0x1,0x1c2(%r13)
>    7:	01 
>    8:	0f 85 22 04 00 00    	jne    0x430
>    e:	49                   	rex.WB
>    f:	bc eb 83 b5 80       	mov    $0x80b583eb,%esp
>   14:	46                   	rex.RX
> 
> 0000000000003cf0 <__queue_work>:
> {
>     3cf0:       e8 00 00 00 00          callq  3cf5 <__queue_work+0x5>
>     3cf5:       55                      push   %rbp
>     3cf6:       48 89 e5                mov    %rsp,%rbp
>     3cf9:       41 57                   push   %r15
>     3cfb:       49 89 d7                mov    %rdx,%r15
>     3cfe:       41 56                   push   %r14
>         unsigned int req_cpu = cpu;
>     3d00:       41 89 fe                mov    %edi,%r14d
> {
>     3d03:       41 55                   push   %r13
>     3d05:       49 89 f5                mov    %rsi,%r13
>     3d08:       41 54                   push   %r12
>     3d0a:       53                      push   %rbx
>     3d0b:       48 83 ec 10             sub    $0x10,%rsp
>     3d0f:       89 7d d4                mov    %edi,-0x2c(%rbp)
>         asm volatile("# __raw_save_flags\n\t"
>     3d12:       9c                      pushfq 
>     3d13:       58                      pop    %rax
>         WARN_ON_ONCE(!irqs_disabled());
>     3d14:       f6 c4 02                test   $0x2,%ah
>     3d17:       0f 85 06 04 00 00       jne    4123 <__queue_work+0x433>
>         if (unlikely(wq->flags & __WQ_DRAINING) &&
>     3d1d:       41 f6 85 c2 01 00 00    testb  $0x1,0x1c2(%r13)
> 
> 
> So we called __queue_work with a null wq.

So, that's somebody calling queue_delayed_work[_on]() with a NULL wq
and when the timeout expires the timer callback trying to queue
against NULL.  Hmm... the work function would be able to tell us who
queued it but it isn't part of the information dumped here (would be
0x18(%rdx)).

I'll add a sanity check on queue_delayed_work_on() so that we can
catch it synchronously when it happens.

Thanks.

-- 
tejun

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ