linux-kernel - Re: perf: another perf

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABPqkBStVPW_W29-6Q0fcNeHPUmRWow3ckT--0rCr-5EM07wow@mail.gmail.com>
Date:	Fri, 8 May 2015 17:04:29 -0700
From:	Stephane Eranian <eranian@...gle.com>
To:	Vince Weaver <vincent.weaver@...ne.edu>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Jiri Olsa <jolsa@...hat.com>, Ingo Molnar <mingo@...hat.com>,
	Paul Mackerras <paulus@...ba.org>
Subject: Re: perf: another perf_fuzzer generated lockup

Vince,

On Thu, May 7, 2015 at 9:40 PM, Vince Weaver <vincent.weaver@...ne.edu> wrote:
>
>
> This is a new one I think, I hit it on the haswell machine running
> 4.1-rc2.
>
> The backtrace is complex enough I'm not really sure what's going on here.
>
> The fuzzer has been having weird issues where it's been getting
> overflow signals from invalid fds.  This seems to happen
> when an overflow signal interrupts the fuzzer mid-fork?
> And the fuzzer code doesn't handle this well and attempts to call exit()
> and/or kill the child from the signal handler that interrupted the
> fork() and that doesn't always go well.  I'm not sure if this is related,
> just that some of those actions seem to appear in the backtrace.
>
>
Is there a way to figure out how the fuzzer had programmed the PMU
to get there? (besides adding PMU state dump in the kernel crashdump)?


>
> [33864.529861] ------------[ cut here ]------------
> [33864.534824] WARNING: CPU: 1 PID: 9852 at kernel/watchdog.c:302 watchdog_overflow_callback+0x92/0xc0()
> [33864.544682] Watchdog detected hard LOCKUP on cpu 1
> [33864.549635] Modules linked in:
> [33864.552943]  fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic kvm snd_hda_intel snd_hda_controller snd_hda_codec snd_hda_core crct10dif_pclmul snd_hwdep crc32_pclmul ghash_clmulni_intel snd_pcm aesni_intel aes_x86_64 lrw gf128mul evdev i915 iTCO_wdt iTCO_vendor_support glue_helper snd_timer ppdev ablk_helper psmouse drm_kms_helper cryptd snd drm pcspkr serio_raw lpc_ich soundcore parport_pc xhci_pci battery video processor i2c_i801 mei_me mei wmi i2c_algo_bit tpm_tis mfd_core xhci_hcd tpm parport button sg sr_mod sd_mod cdrom ehci_pci ahci ehci_hcd libahci libata e1000e ptp usbcore crc32c_intel scsi_mod fan usb_common pps_core thermal thermal_sys
> [33864.622413] CPU: 1 PID: 9852 Comm: perf_fuzzer Not tainted 4.1.0-rc2+ #142
> [33864.629776] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> [33864.637685]  ffffffff81a209b5 ffff88011ea45aa0 ffffffff816d51d3 0000000000000000
> [33864.645709]  ffff88011ea45af0 ffff88011ea45ae0 ffffffff81072dfa ffff88011ea45ac0
> [33864.653731]  ffff880119b8f800 0000000000000000 ffff88011ea45c40 ffff88011ea45ef8
> [33864.661783] Call Trace:
> [33864.664409]  <NMI>  [<ffffffff816d51d3>] dump_stack+0x45/0x57
> [33864.670618]  [<ffffffff81072dfa>] warn_slowpath_common+0x8a/0xc0
> [33864.677071]  [<ffffffff81072e76>] warn_slowpath_fmt+0x46/0x50
> [33864.683202]  [<ffffffff8104db64>] ? native_apic_wait_icr_idle+0x24/0x30
> [33864.690280]  [<ffffffff81124f22>] watchdog_overflow_callback+0x92/0xc0
> [33864.697294]  [<ffffffff811673b1>] __perf_event_overflow+0x91/0x270
> [33864.703916]  [<ffffffff811673f9>] ? __perf_event_overflow+0xd9/0x270
> [33864.710696]  [<ffffffff8102b60a>] ? x86_perf_event_set_period+0xda/0x180
> [33864.717842]  [<ffffffff81167f89>] perf_event_overflow+0x19/0x20
> [33864.724195]  [<ffffffff810338f2>] intel_pmu_handle_irq+0x1e2/0x450
> [33864.730840]  [<ffffffff8102a41b>] perf_event_nmi_handler+0x2b/0x50
> [33864.737436]  [<ffffffff810185d0>] nmi_handle+0xa0/0x150
> [33864.743025]  [<ffffffff81018535>] ? nmi_handle+0x5/0x150
> [33864.748733]  [<ffffffff8101887a>] default_do_nmi+0x4a/0x140
> [33864.754705]  [<ffffffff81018a08>] do_nmi+0x98/0xe0
> [33864.759858]  [<ffffffff816df42f>] end_repeat_nmi+0x1e/0x2e
> [33864.765746]  [<ffffffff810bc1bb>] ? check_chain_key+0xdb/0x1e0
> [33864.772004]  [<ffffffff810bc1bb>] ? check_chain_key+0xdb/0x1e0
> [33864.778253]  [<ffffffff810bc1bb>] ? check_chain_key+0xdb/0x1e0
> [33864.784498]  <<EOE>>  <IRQ>  [<ffffffff810bdd09>] __lock_acquire.isra.31+0x3b9/0x1000
> [33864.792950]  [<ffffffff810bdd09>] ? __lock_acquire.isra.31+0x3b9/0x1000
> [33864.800045]  [<ffffffff810bf2e5>] lock_acquire+0xa5/0x130
> [33864.805817]  [<ffffffff81080afe>] ? __lock_task_sighand+0x6e/0x110
> [33864.812468]  [<ffffffff81080aaa>] ? __lock_task_sighand+0x1a/0x110
> [33864.819084]  [<ffffffff816dc721>] _raw_spin_lock+0x31/0x40
> [33864.824979]  [<ffffffff81080afe>] ? __lock_task_sighand+0x6e/0x110
> [33864.831623]  [<ffffffff81080afe>] __lock_task_sighand+0x6e/0x110
> [33864.838096]  [<ffffffff81080aaa>] ? __lock_task_sighand+0x1a/0x110
> [33864.845314]  [<ffffffff81080bcc>] do_send_sig_info+0x2c/0x80
> [33864.851949]  [<ffffffff81167807>] ? perf_swevent_event+0x67/0x90
> [33864.858980]  [<ffffffff8120092f>] send_sigio_to_task+0x12f/0x1a0
> [33864.866005]  [<ffffffff81200805>] ? send_sigio_to_task+0x5/0x1a0
> [33864.873047]  [<ffffffff812010c6>] ? send_sigio+0x56/0x100
> [33864.879411]  [<ffffffff8120111e>] send_sigio+0xae/0x100
> [33864.885564]  [<ffffffff81201207>] kill_fasync+0x97/0xf0
> [33864.891713]  [<ffffffff8120117f>] ? kill_fasync+0xf/0xf0
> [33864.897983]  [<ffffffff811667d4>] perf_event_wakeup+0xd4/0xf0
> [33864.904662]  [<ffffffff81166705>] ? perf_event_wakeup+0x5/0xf0
> [33864.911490]  [<ffffffff811668d0>] ? perf_pending_event+0xe0/0x110
> [33864.918580]  [<ffffffff811668d0>] perf_pending_event+0xe0/0x110
> [33864.925494]  [<ffffffff81156eac>] irq_work_run_list+0x4c/0x80
> [33864.932197]  [<ffffffff81156ef8>] irq_work_run+0x18/0x40
> [33864.938469]  [<ffffffff8101953f>] smp_trace_irq_work_interrupt+0x3f/0xc0
> [33864.946263]  [<ffffffff816dea3e>] trace_irq_work_interrupt+0x6e/0x80
> [33864.953646]  <EOI>  [<ffffffff811a1b47>] ? copy_page_range+0x527/0x9a0
> [33864.961287]  [<ffffffff811a1b22>] ? copy_page_range+0x502/0x9a0
> [33864.968265]  [<ffffffff810710d2>] copy_process.part.23+0xc92/0x1b80
> [33864.975589]  [<ffffffff810813be>] ? SYSC_kill+0x8e/0x230
> [33864.981879]  [<ffffffff81072178>] do_fork+0xd8/0x420
> [33864.987807]  [<ffffffff812007e3>] ? f_setown+0x83/0xa0
> [33864.993953]  [<ffffffff81200d30>] ? SyS_fcntl+0x310/0x650
> [33865.000348]  [<ffffffff81417a23>] ? lockdep_sys_exit_thunk+0x12/0x14
> [33865.007781]  [<ffffffff81072546>] SyS_clone+0x16/0x20
> [33865.013830]  [<ffffffff816dd1b2>] system_call_fastpath+0x16/0x7a
> [33865.020843] ---[ end trace d3bd7d73656f3cba ]---
> [33865.026418] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 496.487 msecs
> [33865.035874] perf interrupt took too long (3879951 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/