Date:	Wed, 27 Jul 2011 14:51:47 -0400 (EDT)
From:	Vince Weaver <vweaver1@...s.utk.edu>
To:	linux-kernel@...r.kernel.org
cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Paul Mackerras <paulus@...ba.org>, Ingo Molnar <mingo@...e.hu>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: Re: [perf] overflow/perf_count_sw_cpu_clock crashes recent kernels

Hello

> With 3.0.0 the PAPI "overflow_allcounters" test reliably locks up my 
> Nehalem system.

I finally managed to narrow this down to a small test, which is attached.

Basically, measuring overflow on the perf::perf_count_sw_cpu_clock
event can *lock up* your system from user-space.
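
For readers without the attachment, here is a rough sketch of what
"measuring overflow" on this event involves. It is not the attached test
program. The function names and the choice of plain SIGIO delivery are
illustrative assumptions; only the perf_event_open interface itself is real.
The idea is to set a sample period on the software CPU clock event so that
the kernel signals the process each time the period elapses.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: describe a software cpu-clock event that overflows
 * every `period` units of the clock. The caller picks the period. */
struct perf_event_attr cpu_clock_overflow_attr(unsigned long long period)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_SOFTWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_SW_CPU_CLOCK;
    attr.sample_period = period;     /* overflow once per period */
    attr.sample_type = PERF_SAMPLE_IP;
    attr.disabled = 1;               /* start stopped; enable via ioctl later */
    attr.wakeup_events = 1;          /* notify on every overflow */
    return attr;
}

/* glibc ships no wrapper for this syscall, so call it directly. */
long sys_perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                         int group_fd, unsigned long flags)
{
    return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Hypothetical helper: open the event on the calling thread and request
 * SIGIO on overflow. Returns the event fd, or -1 on failure. */
int open_cpu_clock_overflow(unsigned long long period)
{
    struct perf_event_attr attr = cpu_clock_overflow_attr(period);
    int fd = (int)sys_perf_event_open(&attr, 0, -1, -1, 0);

    if (fd < 0)
        return -1;
    /* Deliver the default async signal (SIGIO) to this process. */
    if (fcntl(fd, F_SETFL, O_RDWR | O_NONBLOCK | O_ASYNC) < 0 ||
        fcntl(fd, F_SETOWN, getpid()) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would install a SIGIO handler, open the event, and then start it
with ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) or PERF_EVENT_IOC_REFRESH. Do not
try that on an affected kernel unless you are prepared to lose the machine.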

This seems to be a long-standing bug.  It quickly locks my Nehalem
test box solid on 3.0, 2.6.39 and 2.6.38.

On a Core2 box running 2.6.32, the test program wedges and becomes
unkillable, but it doesn't actually take down the machine.

As mentioned before, on the Nehalem machine the following warning happens
before the machine becomes unusable:

[  392.504845] ------------[ cut here ]------------
[  392.504962] WARNING: at kernel/smp.c:320 smp_call_function_single+0x6c/0xf2()
[  392.505074] Hardware name: Precision M4500
[  392.505181] Modules linked in: acpi_cpufreq cpufreq_conservative mperf cpufreq_powersave cpufreq_userspace cpufreq_stats uinput nouveau snd_hda_codec_hdmi ttm drm_kms_helper mxm_wmi snd_hda_codec_idt iwlagn mac80211 snd_hda_intel snd_hda_codec cfg80211 dell_laptop snd_hwdep video processor ehci_hcd dell_wmi sparse_keymap psmouse sdhci_pci rfkill snd_pcm sdhci thermal_sys pcspkr ac battery wmi serio_raw snd_timer snd_page_alloc evdev i2c_i801 dcdbas button
[  392.509709] Pid: 2310, comm: overflow_allcou Not tainted 3.0.0 #43
[  392.509819] Call Trace:
[  392.509925]  <IRQ>  [<ffffffff81041bb0>] ? warn_slowpath_common+0x78/0x8c
[  392.510144]  [<ffffffff810a4d19>] ? perf_exclude_event.part.23+0x31/0x31
[  392.510257]  [<ffffffff8106b7c5>] ? smp_call_function_single+0x6c/0xf2
[  392.510369]  [<ffffffff810a38aa>] ? task_function_call+0x42/0x4c
[  392.510476]  [<ffffffff810a50e4>] ? update_cgrp_time_from_event+0x2c/0x2c
[  392.510589]  [<ffffffff810a5a9d>] ? perf_event_disable+0x45/0x8c
[  392.510700]  [<ffffffff810a8e89>] ? __perf_event_overflow+0xf1/0x1a3
[  392.510812]  [<ffffffff8103c7bb>] ? select_task_rq_fair+0x349/0x574
[  392.510924]  [<ffffffff810a7f0a>] ? perf_ctx_adjust_freq+0x42/0xe6
[  392.511038]  [<ffffffff8105f152>] ? sched_clock_cpu+0xb/0xc3
[  392.511152]  [<ffffffff8100ded5>] ? paravirt_read_tsc+0x5/0x8
[  392.511262]  [<ffffffff8100e320>] ? native_sched_clock+0x27/0x2f
[  392.511366]  [<ffffffff810a9500>] ? perf_event_overflow+0x10/0x10
[  392.511476]  [<ffffffff810a959f>] ? perf_swevent_hrtimer+0x9f/0xda
[  392.511599]  [<ffffffff8105ca04>] ? run_posix_cpu_timers+0x23/0x346
[  392.511721]  [<ffffffff8131c2ef>] ? rb_insert_color+0xb1/0xd9
[  392.511841]  [<ffffffff8105d373>] ? __run_hrtimer+0xac/0x135
[  392.511960]  [<ffffffff8105daa3>] ? hrtimer_interrupt+0xdb/0x195
[  392.512083]  [<ffffffff8108d040>] ? check_for_new_grace_period.isra.32+0x99/0xa4
[  392.512220]  [<ffffffff8108d201>] ? __rcu_process_callbacks+0x72/0x2b7
[  392.512345]  [<ffffffff8102437c>] ? hpet_interrupt_handler+0x23/0x2b
[  392.512469]  [<ffffffff81089446>] ? handle_irq_event_percpu+0x50/0x180
[  392.512592]  [<ffffffff81046f76>] ? __do_softirq+0x13e/0x177
[  392.512713]  [<ffffffff810f680c>] ? send_sigio+0x95/0xab
[  392.512832]  [<ffffffff810895aa>] ? handle_irq_event+0x34/0x52
[  392.512952]  [<ffffffff8108b433>] ? handle_edge_irq+0x9f/0xc6
[  392.513072]  [<ffffffff8100a831>] ? handle_irq+0x1d/0x21
[  392.513192]  [<ffffffff8100a561>] ? do_IRQ+0x42/0x98
[  392.513314]  [<ffffffff81593053>] ? common_interrupt+0x13/0x13
[  392.513438]  <EOI> 
[  392.513542] ---[ end trace 12f3f913316a2866 ]---

Thanks,

Vince

View attachment "oflo_sw_cpu_clock_crash.c" of type "TEXT/x-csrc" (3065 bytes)
