linux-kernel - proc_flush

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171218214438.GA32728@codemonkey.org.uk>
Date:   Mon, 18 Dec 2017 16:44:38 -0500
From:   Dave Jones <davej@...emonkey.org.uk>
To:     Linux Kernel <linux-kernel@...r.kernel.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Al Viro <viro@...iv.linux.org.uk>
Subject: proc_flush_task oops

I've hit this twice today. It's odd, because afaics, none of this code
has really changed in a long time.

	Dave

Oops: 0000 [#1] SMP
CPU: 2 PID: 6743 Comm: trinity-c117 Not tainted 4.15.0-rc4-think+ #2 
RIP: 0010:proc_flush_task+0x8e/0x1b0
RSP: 0018:ffffc9000bbffc40 EFLAGS: 00010286
RAX: 0000000000000001 RBX: 0000000000000001 RCX: 00000000fffffffb
RDX: 0000000000000000 RSI: ffffc9000bbffc50 RDI: 0000000000000000
RBP: ffffc9000bbffc63 R08: 0000000000000000 R09: 0000000000000002
R10: ffffc9000bbffb70 R11: ffffc9000bbffc64 R12: 0000000000000003
R13: 0000000000000000 R14: 0000000000000003 R15: ffff8804c10d7840
FS:  00007f7cb8965700(0000) GS:ffff88050a200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000003e21ae003 CR4: 00000000001606e0
DR0: 00007fb1d6c22000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Call Trace:
 ? release_task+0xaf/0x680
 release_task+0xd2/0x680
 ? wait_consider_task+0xb82/0xce0
 wait_consider_task+0xbe9/0xce0
 ? do_wait+0xe1/0x330
 do_wait+0x151/0x330
 kernel_wait4+0x8d/0x150
 ? task_stopped_code+0x50/0x50
 SYSC_wait4+0x95/0xa0
 ? rcu_read_lock_sched_held+0x6c/0x80
 ? syscall_trace_enter+0x2d7/0x340
 ? do_syscall_64+0x60/0x210
 do_syscall_64+0x60/0x210
 entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7f7cb82603aa
RSP: 002b:00007ffd60770bc8 EFLAGS: 00000246
 ORIG_RAX: 000000000000003d
RAX: ffffffffffffffda RBX: 00007f7cb6cd4000 RCX: 00007f7cb82603aa
RDX: 000000000000000b RSI: 00007ffd60770bd0 RDI: 0000000000007cca
RBP: 0000000000007cca R08: 00007f7cb8965700 R09: 00007ffd607c7080
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffd60770bd0 R14: 00007f7cb6cd4058 R15: 00000000cccccccd
Code: c1 e2 04 44 8b 60 30 48 8b 40 38 44 8b 34 11 48 c7 c2 60 3a f5 81 44 89 e1 4c 8b 68 58 e8 4b b4 77 00 89 44 24 14 48 8d 74 24 10 <49> 8b 7d 00 e8 b9 6a f9 ff 48 85 c0 74 1a 48 89 c7 48 89 44 24 
RIP: proc_flush_task+0x8e/0x1b0 RSP: ffffc9000bbffc40
CR2: 0000000000000000
---[ end trace 53d67a6481059862 ]---

All code
========
   0:	c1 e2 04             	shl    $0x4,%edx
   3:	44 8b 60 30          	mov    0x30(%rax),%r12d
   7:	48 8b 40 38          	mov    0x38(%rax),%rax
   b:	44 8b 34 11          	mov    (%rcx,%rdx,1),%r14d
   f:	48 c7 c2 60 3a f5 81 	mov    $0xffffffff81f53a60,%rdx
  16:	44 89 e1             	mov    %r12d,%ecx
  19:	4c 8b 68 58          	mov    0x58(%rax),%r13
  1d:	e8 4b b4 77 00       	callq  0x77b46d
  22:	89 44 24 14          	mov    %eax,0x14(%rsp)
  26:	48 8d 74 24 10       	lea    0x10(%rsp),%rsi
  2b:*	49 8b 7d 00          	mov    0x0(%r13),%rdi     <-- trapping instruction
  2f:	e8 b9 6a f9 ff       	callq  0xfffffffffff96aed
  34:	48 85 c0             	test   %rax,%rax
  37:	74 1a                	je     0x53
  39:	48 89 c7             	mov    %rax,%rdi
  3c:	48                   	rex.W
  3d:	89                   	.byte 0x89
  3e:	44                   	rex.R
  3f:	24                   	.byte 0x24

Code starting with the faulting instruction
===========================================
   0:	49 8b 7d 00          	mov    0x0(%r13),%rdi
   4:	e8 b9 6a f9 ff       	callq  0xfffffffffff96ac2
   9:	48 85 c0             	test   %rax,%rax
   c:	74 1a                	je     0x28
   e:	48 89 c7             	mov    %rax,%rdi
  11:	48                   	rex.W
  12:	89                   	.byte 0x89
  13:	44                   	rex.R
  14:	24                   	.byte 0x24



This looks like an inlined part of proc_flush_task_mnt

        dentry = d_hash_and_lookup(mnt->mnt_root, &name);
    4f99:       48 8d 74 24 10          lea    0x10(%rsp),%rsi
    4f9e:       49 8b 7d 00             mov    0x0(%r13),%rdi
    4fa2:       e8 00 00 00 00          callq  4fa7 <proc_flush_task+0x97>

So it looks like this..

3097         for (i = 0; i <= pid->level; i++) {
3098                 upid = &pid->numbers[i];
3099                 proc_flush_task_mnt(upid->ns->proc_mnt, upid->nr,
3100                                         tgid->numbers[i].nr);
3101         }

somehow passed a null upid->ns->proc_mnt down there.

I'll try and narrow down a reproducer tomorrow.

Any obvious recent changes that might explain this, or did I just
finally appease the entropy gods enough to find the right combination of
args to hit this ?

	Dave