Date:   Mon, 19 Jun 2017 16:37:41 +0300
From:   Alexey Budankov <alexey.budankov@...ux.intel.com>
To:     Mark Rutland <mark.rutland@....com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Kan Liang <kan.liang@...el.com>,
        Dmitri Prokhorov <Dmitry.Prohorov@...el.com>,
        Valery Cherepennikov <valery.cherepennikov@...el.com>,
        David Carrillo-Cisneros <davidcc@...gle.com>,
        Stephane Eranian <eranian@...gle.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/n] perf/core: addressing 4x slowdown during
 per-process profiling of STREAM benchmark on Intel Xeon Phi

On 19.06.2017 16:26, Mark Rutland wrote:
> On Mon, Jun 19, 2017 at 04:08:32PM +0300, Alexey Budankov wrote:
>> On 16.06.2017 1:10, Alexey Budankov wrote:
>>> On 15.06.2017 22:56, Mark Rutland wrote:
>>>> On Thu, Jun 15, 2017 at 08:41:42PM +0300, Alexey Budankov wrote:
>>>>> This series of patches continues v2 and addresses captured comments.
> 
>>>>> Specifically this patch replaces pinned_groups and flexible_groups
>>>>> lists of perf_event_context by red-black cpu indexed trees avoiding
>>>>> data structures duplication and introducing possibility to iterate
>>>>> event groups for a specific CPU only.
> 
>>>> Have you thrown Vince's perf fuzzer at this?
>>>>
>>>> If you haven't, please do. It can be found in the fuzzer directory of:
>>>>
>>>> https://github.com/deater/perf_event_tests
>>>
>>> Accepted.
>>
>> I ran the test suite and it revealed no additional regressions
>> compared to what I see on the clean kernel.
>>
>> However the fuzzer constantly reports some strange stacks that are
>> not seen on the clean kernel and I have no idea how that might be
>> caused by the patches.
> 
> Ok; that was the kind of thing I was concerned about.
> 
> When you say "strange stacks", what do you mean exactly?
> 
> I take it the kernel spewing backtraces in dmesg?
> 
> Can you dump those?

Here they are:

1:

list_del corruption. prev->next should be ffff88c2c4654010, but was ffff88c31eb0c020
[  607.632813] ------------[ cut here ]------------
[  607.632816] kernel BUG at lib/list_debug.c:53!
[  607.632825] invalid opcode: 0000 [#1] SMP
[  607.632898] Modules linked in: fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables nfsv3 rpcsec_gss_krb5 nfsv4 arc4 md4 nls_utf8 cifs nfs ccm dns_resolver fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp hfi1 coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate iTCO_wdt intel_uncore iTCO_vendor_support joydev rdmavt intel_rapl_perf i2c_i801 ib_core ipmi_ssif mei_me mei ipmi_si ipmi_devintf tpm_tis
[  607.633954]  lpc_ich pcspkr ipmi_msghandler acpi_pad tpm_tis_core shpchp tpm wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc mgag200 drm_kms_helper ttm drm igb crc32c_intel ptp pps_core dca i2c_algo_bit
[  607.634262] CPU: 271 PID: 28944 Comm: perf_fuzzer Not tainted 4.12.0-rc4+ #22
[  607.634363] Hardware name: Intel Corporation S7200AP/S7200AP, BIOS S72C610.86B.01.01.0190.080520162104 08/05/2016
[  607.634505] task: ffff88c2d5714000 task.stack: ffffa6f9352c8000
[  607.634597] RIP: 0010:__list_del_entry_valid+0x7b/0x90
[  607.634670] RSP: 0000:ffffa6f9352cbad0 EFLAGS: 00010046
[  607.634746] RAX: 0000000000000054 RBX: ffff88c2c4654000 RCX: 0000000000000000
[  607.634845] RDX: 0000000000000000 RSI: ffff88c33fdce168 RDI: ffff88c33fdce168
[  607.634944] RBP: ffffa6f9352cbad0 R08: 00000000fffffffe R09: 0000000000000600
[  607.635042] R10: 0000000000000005 R11: 0000000000000000 R12: ffff88c2e71ab200
[  607.635140] R13: ffff88c2c4654010 R14: 0000000000000001 R15: 0000000000000001
[  607.635240] FS:  0000000000000000(0000) GS:ffff88c33fdc0000(0000) knlGS:0000000000000000
[  607.635351] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  607.635431] CR2: 00000000026be1e4 CR3: 0000002488e09000 CR4: 00000000001407e0
[  607.635531] Call Trace:
[  607.635583]  list_del_event+0x1d7/0x210
[  607.635646]  ? perf_cgroup_attach+0x70/0x70
[  607.635711]  __perf_remove_from_context+0x3e/0x90
[  607.635783]  ? event_sched_out.isra.90+0x300/0x300
[  607.635854]  event_function_call+0xbf/0xf0
[  607.635918]  ? event_sched_out.isra.90+0x300/0x300
[  607.635991]  perf_remove_from_context+0x25/0x70
[  607.636060]  perf_event_release_kernel+0xda/0x250
[  607.636132]  ? __dentry_kill+0x10e/0x160
[  607.636192]  perf_release+0x10/0x20
[  607.636249]  __fput+0xdf/0x1e0
[  607.636299]  ____fput+0xe/0x10
[  607.636350]  task_work_run+0x83/0xb0
[  607.636408]  do_exit+0x2bc/0xbc0
[  607.636460]  ? page_add_file_rmap+0xaf/0x200
[  607.636526]  ? alloc_set_pte+0x115/0x4f0
[  607.636587]  do_group_exit+0x3f/0xb0
[  607.636644]  get_signal+0x1cc/0x5c0
[  607.636703]  do_signal+0x37/0x6a0
[  607.636758]  ? __perf_sw_event+0x4f/0x80
[  607.636821]  ? __do_page_fault+0x2e1/0x4d0
[  607.636885]  exit_to_usermode_loop+0x4c/0x92
[  607.636953]  prepare_exit_to_usermode+0x40/0x50
[  607.637023]  retint_user+0x8/0x13
[  607.640312] RIP: 0033:0x40f4a9
[  607.643500] RSP: 002b:00007ffc62d00668 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff02
[  607.646678] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000001100a
[  607.649777] RDX: 00007fd5dd25bae0 RSI: 00007fd5dd259760 RDI: 00007fd5dd25a640
[  607.652791] RBP: 00007ffc62d00680 R08: 00007fd5dd45e740 R09: 0000000000000000
[  607.655709] R10: 00007fd5dd45ea10 R11: 0000000000000246 R12: 0000000000401980
[  607.658530] R13: 00007ffc62d02a80 R14: 0000000000000000 R15: 0000000000000000
[  607.661249] Code: e8 3a f7 d8 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 20 66 cb ac e8 27 f7 d8 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 e0 65 cb ac e8 14 f7 d8 ff <0f> 0b 48 89 fe 31 c0 48 c7 c7 a8 65 cb ac e8 01 f7 d8 ff 0f 0b
[  607.666819] RIP: __list_del_entry_valid+0x7b/0x90 RSP: ffffa6f9352cbad0
[  607.683316] ---[ end trace 34244c35550e0713 ]---
[  607.691830] Fixing recursive fault but reboot is needed!
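
For reference, the "list_del corruption" message in report 1 comes from the kernel's list debugging check (lib/list_debug.c in the trace), which verifies before unlinking that both neighbours still point back at the entry being removed. Below is a toy Python model of that check, not kernel code. `ListHead`, `list_add` and `list_del` mirror kernel names; `check_del_entry` and the simulated stray write are hypothetical and only illustrate one way such a report can arise. It makes no claim about what actually corrupts the list in this case.

```python
class ListHead:
    """Toy stand-in for the kernel's circular doubly linked list node."""

    def __init__(self, name):
        self.name = name
        self.prev = self  # an empty list points at itself
        self.next = self


def list_add(new, head):
    """Link `new` right after `head`, updating all four pointers."""
    nxt = head.next
    nxt.prev = new
    new.next = nxt
    new.prev = head
    head.next = new


def check_del_entry(entry):
    """Hypothetical helper: return None if the neighbours point back at
    `entry`, otherwise a message shaped like the one in the report."""
    if entry.prev.next is not entry:
        return ("list_del corruption. prev->next should be %s, but was %s"
                % (entry.name, entry.prev.next.name))
    if entry.next.prev is not entry:
        return ("list_del corruption. next->prev should be %s, but was %s"
                % (entry.name, entry.next.prev.name))
    return None


def list_del(entry):
    """Unlink `entry`, refusing to touch a list that looks corrupted."""
    problem = check_del_entry(entry)
    if problem is not None:
        raise RuntimeError(problem)
    entry.next.prev = entry.prev
    entry.prev.next = entry.next
    entry.prev = entry.next = entry  # the kernel writes poison values here


if __name__ == "__main__":
    head, a, b = ListHead("head"), ListHead("a"), ListHead("b")
    list_add(a, head)
    # Simulated bug (an assumption for illustration): some other path
    # overwrites head.next without fixing up a.prev.
    head.next = b
    try:
        list_del(a)
    except RuntimeError as err:
        print(err)
```

In the model, deleting `a` fails because `a.prev` is still `head` while `head.next` now refers to a different node, which is the same shape as "prev->next should be X, but was Y" above.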

2:

[  467.942059] unchecked MSR access error: WRMSR to 0x711 (tried to write 0x00000000e8cc0055) at rIP: 0xffffffffac05fbd4 (native_write_msr+0x4/0x30)
[  467.942068] Call Trace:
[  467.942073]  <IRQ>
[  467.942094]  ? snbep_uncore_msr_enable_event+0x54/0x60 [intel_uncore]
[  467.942107]  uncore_pmu_event_start+0x9b/0x100 [intel_uncore]
[  467.942119]  uncore_pmu_event_add+0x235/0x3a0 [intel_uncore]
[  467.942126]  ? sched_clock+0xb/0x10
[  467.942132]  ? sched_clock_cpu+0x11/0xb0
[  467.942140]  event_sched_in.isra.100+0xdf/0x250
[  467.942145]  sched_in_group+0x210/0x390
[  467.942150]  ? sched_in_group+0x390/0x390
[  467.942155]  group_sched_in_flexible_callback+0x17/0x20
[  467.942160]  perf_cpu_tree_iterate+0x45/0x75
[  467.942165]  ctx_sched_in+0x97/0x110
[  467.942169]  perf_event_sched_in+0x77/0x80
[  467.942174]  ctx_resched+0x69/0xb0
[  467.942179]  __perf_event_enable+0x208/0x250
[  467.942184]  event_function+0x93/0xe0
[  467.942188]  remote_function+0x3b/0x50
[  467.942194]  flush_smp_call_function_queue+0x71/0x120
[  467.942200]  generic_smp_call_function_single_interrupt+0x13/0x30
[  467.942206]  smp_call_function_single_interrupt+0x27/0x40
[  467.942212]  call_function_single_interrupt+0x93/0xa0
[  467.942217] RIP: 0010:native_restore_fl+0x6/0x10
[  467.942220] RSP: 0000:ffff88c33ba03e00 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff04
[  467.942224] RAX: ffff88c33c8dca08 RBX: ffff88c33c8dc928 RCX: 0000000000000017
[  467.942227] RDX: 0000000000000000 RSI: 0000000000000206 RDI: 0000000000000206
[  467.942229] RBP: ffff88c33ba03e00 R08: 0000000000000001 R09: 0000000000007151
[  467.942232] R10: 0000000000000000 R11: 000000000000005d R12: ffff88c33c8dca08
[  467.942234] R13: ffff88c33c8dc140 R14: 0000000000000001 R15: 00000000000001d8
[  467.942241]  _raw_spin_unlock_irqrestore+0x16/0x20
[  467.942245]  update_blocked_averages+0x2cf/0x4a0
[  467.942251]  rebalance_domains+0x4b/0x2b0
[  467.942256]  run_rebalance_domains+0x1d7/0x210
[  467.942260]  __do_softirq+0xd1/0x27f
[  467.942267]  irq_exit+0xe9/0x100
[  467.942271]  scheduler_ipi+0x8f/0x140
[  467.942275]  smp_reschedule_interrupt+0x29/0x30
[  467.942280]  reschedule_interrupt+0x93/0xa0
[  467.942284] RIP: 0010:native_safe_halt+0x6/0x10
[  467.942286] RSP: 0000:fffffffface03de0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff02
[  467.942291] RAX: 0000000000000000 RBX: fffffffface104c0 RCX: 0000000000000000
[  467.942293] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  467.942295] RBP: fffffffface03de0 R08: 0000006d2c5efae3 R09: 0000000000000000
[  467.942298] R10: 0000000000000201 R11: 0000000000000930 R12: 0000000000000000
[  467.942300] R13: fffffffface104c0 R14: 0000000000000000 R15: 0000000000000000
[  467.942302]  </IRQ>
[  467.942308]  default_idle+0x20/0x100
[  467.942313]  arch_cpu_idle+0xf/0x20
[  467.942317]  default_idle_call+0x2c/0x40
[  467.942321]  do_idle+0x158/0x1e0
[  467.942325]  cpu_startup_entry+0x71/0x80
[  467.942330]  rest_init+0x77/0x80
[  467.942338]  start_kernel+0x4a7/0x4c8
[  467.942342]  ? set_init_arg+0x5a/0x5a
[  467.942348]  ? early_idt_handler_array+0x120/0x120
[  467.942352]  x86_64_start_reservations+0x29/0x2b
[  467.942357]  x86_64_start_kernel+0x151/0x174
[  467.942363]  secondary_startup_64+0x9f/0x9f
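
Report 2 passes through perf_cpu_tree_iterate, the per-CPU iteration added by the patch. As background, here is a toy Python model of the idea of indexing event groups by CPU so that scheduling in on one CPU does not have to walk every group. It is a sketch under stated assumptions, not the patch itself: a sorted key list plus a dict stands in for the red-black tree, and `CpuIndexedGroups`, `iterate_cpu`, `sched_in_for_cpu` and `ANY_CPU` are hypothetical names. Visiting the cpu == -1 groups alongside the current CPU's groups is also an assumption of this sketch.

```python
import bisect

ANY_CPU = -1  # assumed key for groups that are not bound to one CPU


class CpuIndexedGroups:
    """Toy stand-in for a CPU-keyed tree of event groups.

    A sorted list of keys plus a dict replaces the red-black tree; each
    key maps to the groups registered for that CPU, in insertion order.
    """

    def __init__(self):
        self._cpus = []    # sorted CPU keys
        self._groups = {}  # cpu -> list of groups

    def insert(self, cpu, group):
        if cpu not in self._groups:
            bisect.insort(self._cpus, cpu)
            self._groups[cpu] = []
        self._groups[cpu].append(group)

    def remove(self, cpu, group):
        groups = self._groups[cpu]
        groups.remove(group)
        if not groups:  # drop the node once its last group is gone
            del self._groups[cpu]
            self._cpus.remove(cpu)

    def cpus(self):
        return list(self._cpus)

    def iterate_cpu(self, cpu, callback):
        """Call `callback` only for the groups keyed by `cpu`."""
        # Iterate over a copy so a callback may remove groups safely.
        for group in list(self._groups.get(cpu, ())):
            callback(group)


def sched_in_for_cpu(tree, cpu, callback):
    """Visit the CPU-agnostic groups, then the groups bound to `cpu`."""
    tree.iterate_cpu(ANY_CPU, callback)
    tree.iterate_cpu(cpu, callback)


if __name__ == "__main__":
    tree = CpuIndexedGroups()
    tree.insert(ANY_CPU, "any")
    tree.insert(0, "cpu0-group")
    tree.insert(271, "cpu271-group")
    sched_in_for_cpu(tree, 271, print)  # never touches "cpu0-group"
```

The point of the structure is that the cost of a per-CPU walk depends on the number of groups for that CPU rather than on the total number of groups in the context, which is what matters on a machine with hundreds of CPUs.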


> 
> Thanks,
> Mark.
> 
