linux-kernel - Re: perf: perf_fuzzer lockup in perf_cgroup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABPqkBQZTFuMKYX62PXjdHtZRKxKkj=emQA3SepAidLj6tAWVg@mail.gmail.com>
Date:   Wed, 14 Sep 2016 22:35:49 -0700
From:   Stephane Eranian <eranian@...gle.com>
To:     Vince Weaver <vincent.weaver@...ne.edu>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>
Subject: Re: perf: perf_fuzzer lockup in perf_cgroup_attach

On Wed, Sep 14, 2016 at 7:43 PM, Vince Weaver <vincent.weaver@...ne.edu> wrote:
>
> so the skylake that was fuzzing finally is mostly locked up.
>
> Really hard to tell what's going, especially as KASLR made looking up the
> addresses a big pain.
>
I would think there is a way to disable KASLR for this kind of testing!

Which of your fuzzer scripts are you using? fast_repro99.sh?

> The best I can tell things are getting wedged somehow in
> perf_cgroup_switch() while interrupts are disabled.  Interrupts are never
> getting re-enabled, causing the RCU and NMI watchdogs to trigger (and more
> alarming things like the SATA bus resetting).
>
How do you get to perf_cgroup_switch() from the traces you provide below?

> [26292.413603] Task dump for CPU 4:
> [26292.413604] perf_fuzzer     R  running task        0  8870   1096 0x10000008
> [26292.413605]  ffff9045f29e1100 00000000e7143ab0 ffff9045fdd03db8 ffffffff938accef
> [26292.413606]  0000000000000004 0000000000000087 ffff9045fdd03dd0 ffffffff938af5e9
> [26292.413607]  0000000000000004 ffff9045fdd03e00 ffffffff93984928 ffff9045fdd19440
> [26292.413608] Call Trace:
> [26292.413609]  <IRQ>  [<ffffffff938accef>] sched_show_task+0xaf/0x110
> [26292.413611]  [<ffffffff938af5e9>] dump_cpu_task+0x39/0x40
> [26292.413613]  [<ffffffff93984928>] rcu_dump_cpu_stacks+0x80/0xbb
> [26292.413614]  [<ffffffff938e78ce>] rcu_check_callbacks+0x71e/0x880
> [26292.413615]  [<ffffffff9393364c>] ? acct_account_cputime+0x1c/0x20
> [26292.413616]  [<ffffffff938b0049>] ? account_system_time+0x79/0x120
> [26292.413617]  [<ffffffff938fe8b0>] ? tick_sched_do_timer+0x30/0x30
> [26292.413619]  [<ffffffff938ee88f>] update_process_times+0x2f/0x60
> [26292.413619]  [<ffffffff938fe285>] tick_sched_handle.isra.13+0x25/0x60
> [26292.413620]  [<ffffffff938fe8ed>] tick_sched_timer+0x3d/0x70
> [26292.413621]  [<ffffffff938ef633>] __hrtimer_run_queues+0xf3/0x280
> [26292.413623]  [<ffffffff938efb08>] hrtimer_interrupt+0xa8/0x1a0
> [26292.413624]  [<ffffffff93852d88>] local_apic_timer_interrupt+0x38/0x60
> [26292.413625]  [<ffffffff93e1dee4>] smp_trace_apic_timer_interrupt+0x44/0xde
> [26292.413627]  [<ffffffff9396ff60>] ? perf_cgroup_attach+0x70/0x70
> [26292.413628]  [<ffffffff93e1d232>] trace_apic_timer_interrupt+0x82/0x90
> [26292.413628]  <EOI>  [<ffffffff9396ff60>] ? perf_cgroup_attach+0x70/0x70
> [26292.413631]  [<ffffffff939039d6>] ? smp_call_function_single+0xd6/0x130
> [26292.413632]  [<ffffffff9396ff60>] ? perf_cgroup_attach+0x70/0x70
> [26292.413633]  [<ffffffff9396ec53>] cpu_function_call+0x43/0x60
> [26292.413634]  [<ffffffff93976840>] ? __perf_event_enable+0x260/0x260
> [26292.413635]  [<ffffffff939737e1>] perf_install_in_context+0x141/0x150
> [26292.413636]  [<ffffffff9397956e>] SYSC_perf_event_open+0x70e/0xfe0
> [26292.413637]  [<ffffffff938a99e4>] ? check_preempt_curr+0x54/0x90
> [26292.413639]  [<ffffffff9397c099>] SyS_perf_event_open+0x9/0x10
> [26292.413640]  [<ffffffff93803bb4>] do_syscall_64+0x64/0x160
> [26292.413641]  [<ffffffff93e1b625>] entry_SYSCALL64_slow_path+0x25/0x25
>
> [26316.489382] NMI watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [perf_fuzzer:8870]
> [26316.497309] CPU: 4 PID: 8870 Comm: perf_fuzzer Tainted: G        W       4.8.0-rc6+ #5
> [26316.497310] Hardware name: LENOVO 10FY0017US/SKYBAY, BIOS FWKT53A   06/06/2016
> [26316.497310] task: ffff9045f29e1100 task.stack: ffff9045f0a48000
> [26316.497311] RIP: 0010:[<ffffffff939039d6>]  [<ffffffff939039d6>] smp_call_function_single+0xd6/0x130
> [26316.497312] RSP: 0018:ffff9045f0a4bd30  EFLAGS: 00000202
> [26316.497313] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
> [26316.497313] RDX: 0000000000000001 RSI: 00000000000000fb RDI: 0000000000000286
> [26316.497314] RBP: ffff9045f0a4bd78 R08: ffff9045fdc9bed0 R09: ffff9045f2ac8000
> [26316.497314] R10: 0000000000000000 R11: 0000000000000246 R12: ffffffff9396ff60
> [26316.497315] R13: ffff9045fdc9bed0 R14: 000000000000001e R15: ffff9045fdc9bee0
> [26316.497315] FS:  00007f4157a08700(0000) GS:ffff9045fdd00000(0000) knlGS:0000000000000000
> [26316.497316] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [26316.497316] CR2: 000000000693a048 CR3: 0000000231049000 CR4: 00000000003407e0
> [26316.497317] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [26316.497317] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
> [26316.497318] Stack:
> [26316.497318]  0000000000000000 ffff9045ee35da80 0000000000000000 ffffffff9396ff60
> [26316.497319]  ffff9045f0a4bd88 0000000000000003 00000000e7143ab0 ffff9045ee47613a
> [26316.497321]  ffff9045ee476000 ffff9045f0a4bdb0 ffffffff9396ec53 0000000000000000
> [26316.497322] Call Trace:
> [26316.497323]  [<ffffffff9396ff60>] ? perf_cgroup_attach+0x70/0x70
> [26316.497324]  [<ffffffff9396ec53>] cpu_function_call+0x43/0x60
> [26316.497325]  [<ffffffff93976840>] ? __perf_event_enable+0x260/0x260
> [26316.497326]  [<ffffffff939737e1>] perf_install_in_context+0x141/0x150
> [26316.497327]  [<ffffffff9397956e>] SYSC_perf_event_open+0x70e/0xfe0
> [26316.497328]  [<ffffffff938a99e4>] ? check_preempt_curr+0x54/0x90
> [26316.497330]  [<ffffffff9397c099>] SyS_perf_event_open+0x9/0x10
> [26316.497331]  [<ffffffff93803bb4>] do_syscall_64+0x64/0x160
> [26316.497332]  [<ffffffff93e1b625>] entry_SYSCALL64_slow_path+0x25/0x25
> [26316.497332] Code: 25 28 00 00 00 75 70 48 83 c4 38 5b 41 5c 5d c3 48 8d 75 c8 48 89 d1 89 df 4c 89 e2 e8 14 fe ff ff 8b 55 e0 83 e2 01 74 0a f3 90 <8b> 55 e0 83 e2 01 75 f6 eb c3 8b 05 ba e5 dd 00 85 c0 75 85 80
>