Message-ID: <CAMsRxf+g+wtBK2zADN=4U8hE2L+Ox3xyCevXd6aE1wzZ-6EBKw@mail.gmail.com>
Date:	Wed, 28 Oct 2015 08:05:19 +0100
From:	Stephane Eranian <eranian@...glemail.com>
To:	Vince Weaver <vincent.weaver@...ne.edu>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>
Subject: Re: perf: fuzzer triggered trouble on AMD, maybe ibs related

On Thu, Oct 22, 2015 at 6:46 PM, Vince Weaver <vincent.weaver@...ne.edu> wrote:
> Hello
>
> I've been busy but finally had a chance to run perf_fuzzer on current git.
> I am running on an AMD A10 system (my traditional Haswell system is
> otherwise occupied).
>
> I got the following WARNING which was followed by an NMI storm which
> eventually managed to confuse ext4 enough that my / partition was
> remounted read-only? Very alarming.
>
> This is in static void perf_ibs_start(struct perf_event *event, int flags)
>
>         if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
>                 return;
>
I was able to reproduce a similar warning in the generic x86 code:

[ 2357.625987] WARNING: CPU: 2 PID: 17152 at arch/x86/kernel/cpu/perf_event.c:1209 x86_pmu_start+0xa2/0x100()
[ 2357.635775] Modules linked in: cfg80211 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec kvm_amd kvm snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq crct10dif_pclmul crc32_pclmul snd_seq_device snd_timer aesni_intel snd eeepc_wmi asus_wmi aes_x86_64 sparse_keymap lrw video gf128mul glue_helper edac_mce_amd ablk_helper cryptd shpchp edac_core wmi soundcore i2c_piix4 serio_raw 8250_fintek k10temp fam15h_power mac_hid parport_pc ppdev lp parport autofs4 psmouse r8169 ahci libahci mii
[ 2357.687313] CPU: 2 PID: 17152 Comm: perf_fuzzer Not tainted 4.3.0-rc7+ #1
[ 2357.694212] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 PRO, BIOS 1604 10/16/2012
[ 2357.703829]  ffffffff81a9f3e0 ffff88021ec83d80 ffffffff8139bed4 0000000000000000
[ 2357.711636]  ffff88021ec83db8 ffffffff81078f26 ffff88021ec8c040 ffff8800c9f85000
[ 2357.719430]  0000000000000001 ffff8802131d4868 ffff8802131d4800 ffff88021ec83dc8
[ 2357.727158] Call Trace:
[ 2357.729657]  <IRQ>  [<ffffffff8139bed4>] dump_stack+0x44/0x60
[ 2357.735573]  [<ffffffff81078f26>] warn_slowpath_common+0x86/0xc0
[ 2357.746342]  [<ffffffff8107901a>] warn_slowpath_null+0x1a/0x20
[ 2357.756968]  [<ffffffff8102b882>] x86_pmu_start+0xa2/0x100
[ 2357.767071]  [<ffffffff81169bd9>] perf_event_task_tick+0x239/0x270
[ 2357.777894]  [<ffffffff810a2c2b>] scheduler_tick+0x7b/0xd0
[ 2357.788053]  [<ffffffff810efbc0>] ? tick_sched_do_timer+0x30/0x30
[ 2357.798693]  [<ffffffff810e0ef1>] update_process_times+0x51/0x60
[ 2357.809102]  [<ffffffff810ef5e5>] tick_sched_handle.isra.15+0x25/0x60
[ 2357.819956]  [<ffffffff810efc00>] tick_sched_timer+0x40/0x70
[ 2357.829943]  [<ffffffff810e1a34>] __hrtimer_run_queues+0xe4/0x200
[ 2357.840398]  [<ffffffff810e1e58>] hrtimer_interrupt+0xa8/0x1a0
[ 2357.850522]  [<ffffffff8104de58>] local_apic_timer_interrupt+0x38/0x60
[ 2357.861370]  [<ffffffff8179cca4>] smp_trace_apic_timer_interrupt+0x44/0xab
[ 2357.872524]  [<ffffffff8179afb2>] trace_apic_timer_interrupt+0x82/0x90
[ 2357.883314]  <EOI>

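This can be explained if the event is not in cpuc->active_mask, as per
the code in x86_pmu_stop() vs. x86_pmu_start(): x86_pmu_stop() only sets
PERF_HES_STOPPED when it finds the event's bit set in active_mask, so a
later x86_pmu_start() would trip the WARN_ON_ONCE. I am investigating
further.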
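
To make the suspected sequence concrete, here is a simplified userspace
model of the stop/start bookkeeping. It is a sketch, not the kernel code:
the model_* names, NUM_COUNTERS, and the flag value are made up for
illustration, and it assumes the tick path issues a stop followed by a
start on the event. How an event ends up outside active_mask without
being marked stopped is exactly the part still under investigation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative flag: the name mirrors the kernel's, the value is arbitrary. */
#define PERF_HES_STOPPED 0x01

/* Hypothetical number of counters in this model. */
#define NUM_COUNTERS 4

struct model_event {
    int idx;            /* counter index assigned to the event */
    unsigned int state; /* PERF_HES_* style flags */
};

struct model_cpu {
    unsigned long active_mask; /* one bit per active counter */
    int warnings;              /* how often the start-side check fired */
};

/*
 * Stop path: the event is only marked STOPPED if its bit was actually
 * set in active_mask. Otherwise the call is a silent no-op.
 */
void model_stop(struct model_cpu *cpuc, struct model_event *ev)
{
    unsigned long bit;

    assert(ev->idx >= 0 && ev->idx < NUM_COUNTERS);
    bit = 1UL << ev->idx;

    if (cpuc->active_mask & bit) {
        cpuc->active_mask &= ~bit;
        ev->state |= PERF_HES_STOPPED;
    }
}

/*
 * Start path: refuses to start an event that is not marked STOPPED and
 * records a warning, standing in for the WARN_ON_ONCE in the trace.
 * Returns true if the event was started.
 */
bool model_start(struct model_cpu *cpuc, struct model_event *ev)
{
    assert(ev->idx >= 0 && ev->idx < NUM_COUNTERS);

    if (!(ev->state & PERF_HES_STOPPED)) {
        cpuc->warnings++;
        fprintf(stderr, "WARNING: start on an event that is not stopped\n");
        return false;
    }

    ev->state &= ~PERF_HES_STOPPED;
    cpuc->active_mask |= 1UL << ev->idx;
    return true;
}

/* Assumed tick-time sequence: stop the event, then start it again. */
bool model_tick_restart(struct model_cpu *cpuc, struct model_event *ev)
{
    model_stop(cpuc, ev);
    return model_start(cpuc, ev);
}
```

Calling model_tick_restart() on an event whose bit is set in active_mask
round-trips cleanly. Calling it on an event with state == 0 whose bit is
not in active_mask makes the stop a no-op, and the following start hits
the warning path.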


> [  359.629045] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/perf_event_amd_ibs.c:372 perf_ibs_start+0x43/0x131()
> [  359.639091] Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc nls_utf8 nls_cp437 vfat fat snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi kvm_amd kvm sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 snd_hda_intel ablk_helper cryptd snd_hda_codec lrw snd_hda_core gf128mul glue_helper ppdev snd_hwdep hp_wmi snd_pcm evdev sparse_keymap snd_timer pl2303 radeon ttm drm_kms_helper tpm_infineon pcspkr drm efivars psmouse serio_raw i2c_piix4 i2c_algo_bit usbserial fb_sys_fops shpchp k10temp parport_pc snd syscopyarea i2c_core parport soundcore tpm_tis wmi sysfillrect button tpm sysimgblt acpi_cpufreq processor sg sr_mod cdrom sd_mod ohci_pci ahci libahci tg3 xhci_pci ptp pps_core libata xhci_hcd ohci_hcd ehci_pci libphy ehci_hcd crc32c_intel
> [  359.711502]  scsi_mod usbcore usb_common
> [  359.714203] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.3.0-rc6+ #12
> [  359.721804] Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013
> [  359.730808]  0000000000000006 ffffffff8123e6b7 0000000000000000 ffffffff8104519a
> [  359.738322]  ffffffff8102a003 ffff880224098c00 ffffe8ffffc036d0 ffffffff81824ec0
> [  359.745832]  ffff88022ec0f8e0 ffffffff8102a003 ffff880224098c00 ffffe8ffffc06a70
> [  359.753328] Call Trace:
> [  359.755793]  <IRQ>  [<ffffffff8123e6b7>] ? dump_stack+0x40/0x50
> [  359.761762]  [<ffffffff8104519a>] ? warn_slowpath_common+0x94/0xa9
> [  359.767963]  [<ffffffff8102a003>] ? perf_ibs_start+0x43/0x131
> [  359.773730]  [<ffffffff8102a003>] ? perf_ibs_start+0x43/0x131
> [  359.779495]  [<ffffffff810d8842>] ? perf_event_task_tick+0x101/0x1b5
> [  359.785874]  [<ffffffff8109476c>] ? tick_sched_do_timer+0x24/0x24
> [  359.791990]  [<ffffffff81063628>] ? scheduler_tick+0x64/0x7d
> [  359.797673]  [<ffffffff810896fd>] ? update_process_times+0x3b/0x45
> [  359.803876]  [<ffffffff810942d3>] ? tick_sched_handle+0x3e/0x4a
> [  359.809820]  [<ffffffff8109479b>] ? tick_sched_timer+0x2f/0x53
> [  359.815676]  [<ffffffff81089f55>] ? __hrtimer_run_queues+0xb9/0x18b
> [  359.821967]  [<ffffffff8108a1e8>] ? hrtimer_interrupt+0x61/0x101
> [  359.827995]  [<ffffffff8102d417>] ? smp_apic_timer_interrupt+0x20/0x2f
> [  359.834549]  [<ffffffff8141e58f>] ? apic_timer_interrupt+0x7f/0x90
> [  359.840745]  <EOI>  [<ffffffff8133f769>] ? cpuidle_enter_state+0xf3/0x145
> [  359.847579]  [<ffffffff8106ebab>] ? cpu_startup_entry+0x170/0x1db
> [  359.853694]  [<ffffffff818eddfd>] ? start_kernel+0x40b/0x413
> [  359.859371] ---[ end trace 93964ed985254224 ]---
> [  360.468852] Uhhuh. NMI received for unknown reason 2d on CPU 2.
> [  360.474790] Do you have a strange power saving mode enabled?
> [  360.480454] Dazed and confused, but trying to continue
> [  360.695032] Uhhuh. NMI received for unknown reason 2d on CPU 1.
> [  360.700985] Do you have a strange power saving mode enabled?
> [  360.706666] Dazed and confused, but trying to continue
> [  361.739498] Uhhuh. NMI received for unknown reason 3d on CPU 0.
> [  361.745438] Do you have a strange power saving mode enabled?
> [  361.751104] Dazed and confused, but trying to continue
> [  361.828053] Uhhuh. NMI received for unknown reason 3d on CPU 0.
> [  361.833989] Do you have a strange power saving mode enabled?
> [  361.839677] Dazed and confused, but trying to continue
>
> .....
>
> [  468.763231] Dazed and confused, but trying to continue
> [  468.794184] Uhhuh. NMI received for unknown reason 2d on CPU 2.
> [  468.794184] Do you have a strange power saving mode enabled?
> [  468.794184] Dazed and confused, but trying to continue
> [  473.190535] sd 0:0:0:0: [sda] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> [  473.199631] sd 0:0:0:0: [sda] tag#2 CDB: Write(10) 2a 00 39 93 49 d0 00 00 18 00
> [  473.207789] blk_update_request: I/O error, dev sda, sector 965954000
> [  473.214857] Aborting journal on device sda2-8.
> [  473.214868] EXT4-fs (sda2): ext4_writepages: jbd2_start: 7158 pages, ino 27394094; err -30
> [  473.214880] EXT4-fs (sda2): ext4_writepages: jbd2_start: 7168 pages, ino 27395265; err -30
> [  473.215802] EXT4-fs (sda2): ext4_writepages: jbd2_start: 7168 pages, ino 27394094; err -30
> [  473.215806] EXT4-fs (sda2): ext4_writepages: jbd2_start: 7168 pages, ino 27395265; err -30
> [  473.215811] EXT4-fs (sda2): ext4_writepages: jbd2_start: 7168 pages, ino 27394094; err -30
> [  473.215814] EXT4-fs (sda2): ext4_writepages: jbd2_start: 7168 pages, ino 27395265; err -30
> [  473.215849] EXT4-fs (sda2): ext4_writepages: jbd2_start: 9223372036854775807 pages, ino 27394094; err -30
> [  473.215859] EXT4-fs (sda2): ext4_writepages: jbd2_start: 9223372036854775807 pages, ino 27395265; err -30
> [  473.409076] EXT4-fs error (device sda2): ext4_journal_check_start:56: Detected aborted journal
> [  473.419003] EXT4-fs (sda2): Remounting filesystem read-only
>