lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 9 Jul 2018 12:06:45 +0200
From:   Claudio <claudio.fontana@...wa.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org
Subject: Re: ftrace performance (sched events): cyclictest shows 25% more
 latency

Hello Steve,

thank you for your response,

On 07/06/2018 11:24 PM, Steven Rostedt wrote:
> On Fri, 6 Jul 2018 08:22:01 +0200
> Claudio <claudio.fontana@...wa.com> wrote:
> 
>> Hello all,
>>
>> I have been experimenting with the idea of leaving ftrace enabled, with sched events,
>> on production systems.
>>
>> The main concern that I am having at the moment is about the impact on the system.
>> Enabling the sched events that I currently need for the tracing application
>> seems to slow down context-switches considerably, and make the system less responsive.
>>
>> I have tested with cyclictest on the mainline kernel, and noticed an increase of min, avg latencies of around 25%.
>>
>> Is this expected?
> 
> No it is not. And this is weird. I'm seeing it too.
> 
>>
>> Some initial investigation into ftrace seems to point at the reservation and commit of the events into the ring buffer
>> as the highest sources of overhead, while event parameters copying, including COMM, does not seem to have any noticeable effect
>> relative to those costs.
> 
> Did you use perf to see where the issues were? If so what perf command
> did you use?

I did not use perf, I will now do it and see what I get.
You can discard my comment regarding reservation and commit, it was based on a wrong premise.

Regarding parameter copying, I just brutally modified the code to not fill fields (TP_fast_assign),
and I just noticed that it did not change the result significantly.

> 
> When running just:
>   perf record ./cyclictest --smp -p95 -m -s -N -l 1000000 -q
> 
> I get this:
> 
>    6.08%  cyclictest  [kernel.vmlinux]    [k] syscall_return_via_sysret
>    4.03%  cyclictest  cyclictest          [.] timerthread
>    3.97%  cyclictest  [kernel.vmlinux]    [k] __schedule
>    3.81%  cyclictest  [kernel.vmlinux]    [k] native_write_msr
>    3.13%  cyclictest  [kernel.vmlinux]    [k] trace_save_cmdline
>    2.71%  cyclictest  [kernel.vmlinux]    [k] do_nanosleep
>    2.48%  cyclictest  [kernel.vmlinux]    [k] do_syscall_64
>    2.11%  cyclictest  [kernel.vmlinux]    [k] pick_next_task_fair
>    2.03%  cyclictest  [kernel.vmlinux]    [k] native_sched_clock
>    1.99%  cyclictest  [kernel.vmlinux]    [k] cpuacct_charge
>    1.93%  cyclictest  [vdso]              [.] __vdso_clock_gettime
>    1.86%  cyclictest  [kernel.vmlinux]    [k] hrtimer_nanosleep
>    1.84%  cyclictest  [kernel.vmlinux]    [k] cpupri_set
>    1.66%  cyclictest  [kernel.vmlinux]    [k] update_curr_rt
>    1.65%  cyclictest  [kernel.vmlinux]    [k] __perf_event_task_sched_out
>    1.59%  cyclictest  [kernel.vmlinux]    [k] dequeue_rt_stack
>    1.55%  cyclictest  [kernel.vmlinux]    [k] __rb_reserve_next
> 
> Which shows there's a lot being done before we even get to the ring
> buffer reserve.

I get (CONFIG_PREEMPT_VOLUNTARY):

SCHED EVENTS DISABLED:

     8.57%  cyclictest  [kernel.vmlinux]    [k] syscall_return_via_sysret
     7.04%  cyclictest  [kernel.vmlinux]    [k] native_write_msr
     6.93%  cyclictest  [kernel.vmlinux]    [k] entry_SYSCALL_64_trampoline
     4.42%  cyclictest  cyclictest          [.] timerthread
     4.14%  cyclictest  [kernel.vmlinux]    [k] __schedule
     3.52%  cyclictest  [kernel.vmlinux]    [k] cpupri_set
     2.90%  cyclictest  [kernel.vmlinux]    [k] finish_task_switch
     2.42%  cyclictest  [kernel.vmlinux]    [k] update_curr_rt
     2.34%  cyclictest  [kernel.vmlinux]    [k] _raw_spin_lock
     2.28%  cyclictest  [kernel.vmlinux]    [k] ktime_get
     2.14%  cyclictest  [kernel.vmlinux]    [k] timerqueue_add
     2.06%  cyclictest  [kernel.vmlinux]    [k] read_tsc
     2.06%  cyclictest  [kernel.vmlinux]    [k] native_sched_clock
     2.01%  cyclictest  [vdso]              [.] __vdso_clock_gettime
     1.95%  cyclictest  [kernel.vmlinux]    [k] do_nanosleep
     1.93%  cyclictest  [kernel.vmlinux]    [k] pick_next_task_fair
     1.89%  cyclictest  [vdso]              [.] 0x0000000000000977
     1.75%  cyclictest  [kernel.vmlinux]    [k] dequeue_task_rt
     1.64%  cyclictest  [kernel.vmlinux]    [k] x86_pmu_disable
     1.57%  cyclictest  [kernel.vmlinux]    [k] dequeue_rt_stack
     1.53%  cyclictest  [kernel.vmlinux]    [k] __x64_sys_nanosleep
     1.51%  cyclictest  [kernel.vmlinux]    [k] do_syscall_64
     1.41%  cyclictest  [kernel.vmlinux]    [k] put_prev_task_rt
     1.34%  cyclictest  [kernel.vmlinux]    [k] pick_next_task_dl
     1.28%  cyclictest  [kernel.vmlinux]    [k] pick_next_task_rt
     1.25%  cyclictest  [kernel.vmlinux]    [k] cpuacct_charge
     1.25%  cyclictest  [kernel.vmlinux]    [k] hrtimer_start_range_ns
     1.24%  cyclictest  [kernel.vmlinux]    [k] __x86_indirect_thunk_rax
     1.07%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c1b
     0.94%  cyclictest  [kernel.vmlinux]    [k] pull_rt_task
     0.93%  cyclictest  [kernel.vmlinux]    [k] entry_SYSCALL_64_stage2
     0.89%  cyclictest  [kernel.vmlinux]    [k] hrtimer_nanosleep
     0.89%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c1d
     0.88%  cyclictest  [kernel.vmlinux]    [k] entry_SYSCALL_64
     0.79%  cyclictest  [kernel.vmlinux]    [k] _raw_spin_lock_irqsave
     0.79%  cyclictest  [kernel.vmlinux]    [k] ctx_sched_out
     0.78%  cyclictest  [kernel.vmlinux]    [k] copy_user_generic_unrolled
     0.77%  cyclictest  libpthread-2.23.so  [.] __pthread_enable_asynccancel
     0.71%  cyclictest  [kernel.vmlinux]    [k] __perf_event_task_sched_out
     0.71%  cyclictest  [kernel.vmlinux]    [k] pick_next_task_idle
     0.70%  cyclictest  libpthread-2.23.so  [.] __pthread_disable_asynccancel
     0.70%  cyclictest  [kernel.vmlinux]    [k] __hrtimer_init
     0.68%  cyclictest  [kernel.vmlinux]    [k] clear_page_erms
     0.64%  cyclictest  [kernel.vmlinux]    [k] intel_pmu_disable_all
     0.58%  cyclictest  [kernel.vmlinux]    [k] copy_user_enhanced_fast_string
     0.58%  cyclictest  [kernel.vmlinux]    [k] _copy_from_user
     0.58%  cyclictest  [kernel.vmlinux]    [k] _raw_spin_unlock_irqrestore
     0.54%  cyclictest  [kernel.vmlinux]    [k] get_nohz_timer_target
     0.54%  cyclictest  [kernel.vmlinux]    [k] pick_next_task_stop
     0.48%  cyclictest  [kernel.vmlinux]    [k] schedule
     0.47%  cyclictest  [kernel.vmlinux]    [k] lapic_next_deadline
     0.43%  cyclictest  [kernel.vmlinux]    [k] sched_avg_update
     0.42%  cyclictest  [kernel.vmlinux]    [k] __perf_event_task_sched_in
     0.37%  cyclictest  libc-2.23.so        [.] __clock_gettime
     0.37%  cyclictest  [vdso]              [.] 0x0000000000000974
     0.35%  cyclictest  [kernel.vmlinux]    [k] get_timespec64
     0.32%  cyclictest  [kernel.vmlinux]    [k] dequeue_top_rt_rq
     0.31%  cyclictest  [kernel.vmlinux]    [k] try_to_unmap_one
     0.27%  cyclictest  [kernel.vmlinux]    [k] update_rq_clock
     0.24%  cyclictest  [kernel.vmlinux]    [k] rb_insert_color
     0.23%  cyclictest  [kernel.vmlinux]    [k] deactivate_task
     0.22%  cyclictest  [kernel.vmlinux]    [k] follow_page_pte
     0.21%  cyclictest  [kernel.vmlinux]    [k] page_not_mapped
     0.20%  cyclictest  [kernel.vmlinux]    [k] sched_clock_cpu
     0.18%  cyclictest  cyclictest          [.] main
     0.17%  cyclictest  [kernel.vmlinux]    [k] clockevents_program_event
     0.17%  cyclictest  [kernel.vmlinux]    [k] hrtimer_reprogram
     0.16%  cyclictest  [kernel.vmlinux]    [k] find_busiest_group
     0.15%  cyclictest  [kernel.vmlinux]    [k] enqueue_hrtimer
     0.15%  cyclictest  libc-2.23.so        [.] __nanosleep
     0.14%  cyclictest  cyclictest          [.] clock_gettime@plt
     0.14%  cyclictest  [kernel.vmlinux]    [k] rcu_note_context_switch
     0.14%  cyclictest  [kernel.vmlinux]    [k] __intel_pmu_enable_all
     0.13%  cyclictest  [kernel.vmlinux]    [k] hrtimer_active
     0.13%  cyclictest  libpthread-2.23.so  [.] __nanosleep
     0.13%  cyclictest  [kernel.vmlinux]    [k] __msecs_to_jiffies
     0.13%  cyclictest  [kernel.vmlinux]    [k] rcu_all_qs
     0.13%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c24
     0.13%  cyclictest  [kernel.vmlinux]    [k] hrtimer_try_to_cancel
     0.13%  cyclictest  cyclictest          [.] nanosleep@plt
     0.12%  cyclictest  [kernel.vmlinux]    [k] __intel_pmu_disable_all
     0.12%  cyclictest  [kernel.vmlinux]    [k] _cond_resched
     0.10%  cyclictest  cyclictest          [.] print_stat
     0.10%  cyclictest  [kernel.vmlinux]    [k] perf_ctx_unlock
     0.10%  cyclictest  [kernel.vmlinux]    [k] sched_clock
     0.10%  cyclictest  [kernel.vmlinux]    [k] free_unref_page_list
     0.09%  cyclictest  [vdso]              [.] 0x0000000000000987
     0.09%  cyclictest  [kernel.vmlinux]    [k] update_cfs_group
     0.08%  cyclictest  [kernel.vmlinux]    [k] copy_semundo
     0.08%  cyclictest  [kernel.vmlinux]    [k] update_blocked_averages
     0.07%  cyclictest  [kernel.vmlinux]    [k] tick_program_event
     0.07%  cyclictest  [kernel.vmlinux]    [k] dequeue_task_fair
     0.07%  cyclictest  [kernel.vmlinux]    [k] __handle_mm_fault
     0.06%  cyclictest  [kernel.vmlinux]    [k] plist_del
     0.06%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c0d
     0.06%  cyclictest  [kernel.vmlinux]    [k] put_prev_task_fair
     0.06%  cyclictest  [vdso]              [.] 0x0000000000000979
     0.06%  cyclictest  [kernel.vmlinux]    [k] munlock_vma_pages_range
     0.05%  cyclictest  [vdso]              [.] 0x000000000000097d
     0.05%  cyclictest  [kernel.vmlinux]    [k] __lru_cache_add
     0.05%  cyclictest  [kernel.vmlinux]    [k] update_load_avg
     0.05%  cyclictest  [kernel.vmlinux]    [k] pick_next_entity
     0.04%  cyclictest  [kernel.vmlinux]    [k] update_rt_migration
     0.04%  cyclictest  [kernel.vmlinux]    [k] pagevec_lru_move_fn
     0.04%  cyclictest  [kernel.vmlinux]    [k] reusable_anon_vma
     0.04%  cyclictest  [kernel.vmlinux]    [k] pmd_huge
     0.04%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c21
     0.04%  cyclictest  ld-2.23.so          [.] strcmp
     0.04%  cyclictest  libc-2.23.so        [.] __call_tls_dtors
     0.04%  cyclictest  [kernel.vmlinux]    [k] free_unref_page_commit
     0.04%  cyclictest  [kernel.vmlinux]    [k] unmap_page_range
     0.04%  cyclictest  [kernel.vmlinux]    [k] PageHeadHuge
     0.04%  cyclictest  [kernel.vmlinux]    [k] vma_interval_tree_iter_next
     0.04%  cyclictest  [kernel.vmlinux]    [k] rmap_walk_file
     0.04%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c12
     0.04%  cyclictest  [kernel.vmlinux]    [k] __update_load_avg_se.isra.41
     0.03%  cyclictest  [kernel.vmlinux]    [k] __munlock_pagevec
     0.03%  cyclictest  libc-2.23.so        [.] _itoa_word
     0.03%  cyclictest  [kernel.vmlinux]    [k] __pagevec_lru_add_fn
     0.03%  cyclictest  [kernel.vmlinux]    [k] ext4_file_mmap
     0.03%  cyclictest  [kernel.vmlinux]    [k] free_pcppages_bulk
     0.02%  cyclictest  [kernel.vmlinux]    [k] lock_hrtimer_base.isra.19
     0.02%  cyclictest  [kernel.vmlinux]    [k] __mod_node_page_state
     0.02%  cyclictest  [kernel.vmlinux]    [k] rb_erase
     0.02%  cyclictest  [kernel.vmlinux]    [k] release_pages
     0.02%  cyclictest  [kernel.vmlinux]    [k] clear_buddies
     0.02%  cyclictest  [kernel.vmlinux]    [k] __hrtimer_next_event_base
     0.02%  cyclictest  [kernel.vmlinux]    [k] swapgs_restore_regs_and_return_t
     0.02%  cyclictest  [kernel.vmlinux]    [k] reweight_entity
     0.02%  cyclictest  [kernel.vmlinux]    [k] update_curr
     0.02%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c29
     0.02%  cyclictest  libpthread-2.23.so  [.] start_thread
     0.02%  cyclictest  [kernel.vmlinux]    [k] xhci_queue_intr_tx
     0.02%  cyclictest  [kernel.vmlinux]    [k] __acct_update_integrals
     0.02%  cyclictest  [vdso]              [.] 0x0000000000000971
     0.02%  cyclictest  [kernel.vmlinux]    [k] xhci_check_args
     0.01%  cyclictest  [kernel.vmlinux]    [k] idle_cpu
     0.01%  cyclictest  [kernel.vmlinux]    [k] scheduler_tick
     0.01%  cyclictest  [kernel.vmlinux]    [k] __delayacct_add_tsk
     0.01%  cyclictest  [kernel.vmlinux]    [k] perf_pmu_disable.part.98
     0.01%  cyclictest  [kernel.vmlinux]    [k] end_repeat_nmi
     0.00%  cyclictest  [kernel.vmlinux]    [k] native_irq_return_iret
     0.00%  cyclictest  [kernel.vmlinux]    [k] nmi_restore
     0.00%  cyclictest  [kernel.vmlinux]    [k] do_nmi
     0.00%  cyclictest  [kernel.vmlinux]    [k] nmi_handle
     0.00%  perf        [kernel.vmlinux]    [k] perf_event_addr_filters_exec
     0.00%  cyclictest  [kernel.vmlinux]    [k] perf_event_nmi_handler
     0.00%  perf        [kernel.vmlinux]    [k] native_write_msr
     0.00%  cyclictest  [unknown]           [.] 0000000000000000

SCHED EVENTS ENABLED:

     7.33%  cyclictest  [kernel.vmlinux]    [k] syscall_return_via_sysret
     6.51%  cyclictest  [kernel.vmlinux]    [k] entry_SYSCALL_64_trampoline
     5.84%  cyclictest  [kernel.vmlinux]    [k] native_write_msr
     4.00%  cyclictest  cyclictest          [.] timerthread
     3.62%  cyclictest  [kernel.vmlinux]    [k] __schedule
     2.71%  cyclictest  [kernel.vmlinux]    [k] cpupri_set
     2.57%  cyclictest  [kernel.vmlinux]    [k] finish_task_switch
     2.46%  cyclictest  [kernel.vmlinux]    [k] native_sched_clock
     2.19%  cyclictest  [kernel.vmlinux]    [k] __x86_indirect_thunk_rax
     2.03%  cyclictest  [kernel.vmlinux]    [k] ktime_get
     2.01%  cyclictest  [kernel.vmlinux]    [k] pick_next_task_fair
     1.93%  cyclictest  [kernel.vmlinux]    [k] _raw_spin_lock
     1.90%  cyclictest  [kernel.vmlinux]    [k] timerqueue_add
     1.70%  cyclictest  [kernel.vmlinux]    [k] do_nanosleep
     1.67%  cyclictest  [kernel.vmlinux]    [k] read_tsc
     1.65%  cyclictest  [kernel.vmlinux]    [k] cpuacct_charge
     1.60%  cyclictest  [kernel.vmlinux]    [k] update_curr_rt
     1.49%  cyclictest  [kernel.vmlinux]    [k] probe_sched_switch
     1.40%  cyclictest  [kernel.vmlinux]    [k] __x64_sys_nanosleep
     1.34%  cyclictest  [kernel.vmlinux]    [k] dequeue_rt_stack
     1.28%  cyclictest  [kernel.vmlinux]    [k] do_syscall_64
     1.27%  cyclictest  [kernel.vmlinux]    [k] x86_pmu_disable
     1.24%  cyclictest  [kernel.vmlinux]    [k] __rb_reserve_next
     1.23%  cyclictest  [vdso]              [.] 0x0000000000000977
     1.21%  cyclictest  [vdso]              [.] __vdso_clock_gettime
     1.06%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c1d
     1.06%  cyclictest  [kernel.vmlinux]    [k] entry_SYSCALL_64_stage2
     1.03%  cyclictest  [kernel.vmlinux]    [k] pick_next_task_dl
     1.00%  cyclictest  [kernel.vmlinux]    [k] put_prev_task_rt
     1.00%  cyclictest  [kernel.vmlinux]    [k] pick_next_task_rt
     0.98%  cyclictest  [kernel.vmlinux]    [k] trace_event_raw_event_sched_swit
     0.97%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c1b
     0.95%  cyclictest  [kernel.vmlinux]    [k] pick_next_task_idle
     0.91%  cyclictest  [kernel.vmlinux]    [k] trace_save_cmdline
     0.90%  cyclictest  [kernel.vmlinux]    [k] copy_user_generic_unrolled
     0.90%  cyclictest  [kernel.vmlinux]    [k] ctx_sched_out
     0.90%  cyclictest  [kernel.vmlinux]    [k] lapic_next_deadline
     0.86%  cyclictest  [kernel.vmlinux]    [k] intel_pmu_disable_all
     0.85%  cyclictest  [kernel.vmlinux]    [k] rb_commit
     0.84%  cyclictest  [kernel.vmlinux]    [k] entry_SYSCALL_64
     0.82%  cyclictest  [kernel.vmlinux]    [k] _raw_spin_lock_irqsave
     0.80%  cyclictest  [kernel.vmlinux]    [k] hrtimer_nanosleep
     0.76%  cyclictest  [kernel.vmlinux]    [k] _copy_from_user
     0.75%  cyclictest  [kernel.vmlinux]    [k] __hrtimer_init
     0.72%  cyclictest  [kernel.vmlinux]    [k] ring_buffer_lock_reserve
     0.69%  cyclictest  [kernel.vmlinux]    [k] __perf_event_task_sched_out
     0.66%  cyclictest  [kernel.vmlinux]    [k] hrtimer_start_range_ns
     0.63%  cyclictest  [kernel.vmlinux]    [k] get_nohz_timer_target
     0.62%  cyclictest  [kernel.vmlinux]    [k] clear_page_erms
     0.61%  cyclictest  libpthread-2.23.so  [.] __pthread_disable_asynccancel
     0.59%  cyclictest  [kernel.vmlinux]    [k] dequeue_task_rt
     0.58%  cyclictest  [kernel.vmlinux]    [k] schedule
     0.56%  cyclictest  [kernel.vmlinux]    [k] pull_rt_task
     0.55%  cyclictest  [kernel.vmlinux]    [k] _raw_spin_unlock_irqrestore
     0.54%  cyclictest  [kernel.vmlinux]    [k] rb_insert_color
     0.52%  cyclictest  [kernel.vmlinux]    [k] trace_event_buffer_reserve
     0.51%  cyclictest  [kernel.vmlinux]    [k] trace_clock_local
     0.50%  cyclictest  [kernel.vmlinux]    [k] pick_next_task_stop
     0.48%  cyclictest  [kernel.vmlinux]    [k] __perf_event_task_sched_in
     0.43%  cyclictest  [kernel.vmlinux]    [k] rcu_note_context_switch
     0.41%  cyclictest  [kernel.vmlinux]    [k] trace_event_buffer_commit
     0.41%  cyclictest  [kernel.vmlinux]    [k] deactivate_task
     0.39%  cyclictest  [kernel.vmlinux]    [k] hrtimer_reprogram
     0.36%  cyclictest  libpthread-2.23.so  [.] __pthread_enable_asynccancel
     0.36%  cyclictest  libc-2.23.so        [.] __clock_gettime
     0.34%  cyclictest  [kernel.vmlinux]    [k] tracing_generic_entry_update
     0.34%  cyclictest  [kernel.vmlinux]    [k] enqueue_hrtimer
     0.29%  cyclictest  [kernel.vmlinux]    [k] dequeue_top_rt_rq
     0.29%  cyclictest  [kernel.vmlinux]    [k] perf_ctx_unlock
     0.28%  cyclictest  [kernel.vmlinux]    [k] copy_user_enhanced_fast_string
     0.27%  cyclictest  libpthread-2.23.so  [.] __nanosleep
     0.26%  cyclictest  [kernel.vmlinux]    [k] ring_buffer_unlock_commit
     0.25%  cyclictest  [kernel.vmlinux]    [k] trace_event_buffer_lock_reserve
     0.24%  cyclictest  [kernel.vmlinux]    [k] get_timespec64
     0.23%  cyclictest  [kernel.vmlinux]    [k] find_busiest_group
     0.22%  cyclictest  cyclictest          [.] print_stat
     0.21%  cyclictest  [kernel.vmlinux]    [k] free_unref_page_list
     0.21%  cyclictest  [kernel.vmlinux]    [k] __intel_pmu_enable_all
     0.21%  cyclictest  [kernel.vmlinux]    [k] rb_move_tail
     0.20%  cyclictest  [kernel.vmlinux]    [k] ring_buffer_event_data
     0.20%  cyclictest  [kernel.vmlinux]    [k] vma_interval_tree_subtree_search
     0.20%  cyclictest  [kernel.vmlinux]    [k] sched_avg_update
     0.19%  cyclictest  [kernel.vmlinux]    [k] tick_program_event
     0.19%  cyclictest  [kernel.vmlinux]    [k] follow_page_mask
     0.19%  cyclictest  [kernel.vmlinux]    [k] sched_clock_cpu
     0.19%  cyclictest  [kernel.vmlinux]    [k] tracing_record_taskinfo_sched_sw
     0.16%  cyclictest  [kernel.vmlinux]    [k] get_page_from_freelist
     0.16%  cyclictest  [kernel.vmlinux]    [k] clockevents_program_event
     0.15%  cyclictest  [vdso]              [.] 0x0000000000000974
     0.15%  cyclictest  [kernel.vmlinux]    [k] hrtimer_try_to_cancel
     0.14%  cyclictest  [kernel.vmlinux]    [k] __alloc_pages_nodemask
     0.13%  cyclictest  cyclictest          [.] main
     0.13%  cyclictest  ld-2.23.so          [.] _dl_sysdep_start
     0.13%  cyclictest  [kernel.vmlinux]    [k] rmap_walk_file
     0.12%  cyclictest  [kernel.vmlinux]    [k] update_rq_clock
     0.12%  cyclictest  libc-2.23.so        [.] __libc_disable_asynccancel
     0.12%  cyclictest  [vdso]              [.] 0x0000000000000970
     0.12%  cyclictest  [kernel.vmlinux]    [k] hrtimer_active
     0.11%  cyclictest  [kernel.vmlinux]    [k] page_not_mapped
     0.11%  cyclictest  [kernel.vmlinux]    [k] sched_clock
     0.10%  cyclictest  [kernel.vmlinux]    [k] __mod_zone_page_state
     0.10%  cyclictest  [kernel.vmlinux]    [k] account_entity_dequeue
     0.09%  cyclictest  libc-2.23.so        [.] __nanosleep
     0.09%  cyclictest  [kernel.vmlinux]    [k] rcu_all_qs
     0.09%  cyclictest  [kernel.vmlinux]    [k] _cond_resched
     0.08%  cyclictest  [kernel.vmlinux]    [k] ring_buffer_record_is_on
     0.08%  cyclictest  [kernel.vmlinux]    [k] dequeue_entity
     0.08%  cyclictest  [kernel.vmlinux]    [k] lock_hrtimer_base.isra.19
     0.08%  cyclictest  [kernel.vmlinux]    [k] update_blocked_averages
     0.08%  cyclictest  [kernel.vmlinux]    [k] ftrace_trace_userstack
     0.07%  cyclictest  [kernel.vmlinux]    [k] try_to_unmap_one
     0.07%  cyclictest  [kernel.vmlinux]    [k] tracing_record_taskinfo_skip
     0.07%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c38
     0.07%  cyclictest  [kernel.vmlinux]    [k] __intel_pmu_disable_all
     0.07%  cyclictest  [kernel.vmlinux]    [k] free_pcppages_bulk
     0.07%  cyclictest  [kernel.vmlinux]    [k] task_ctx_sched_out
     0.07%  cyclictest  [kernel.vmlinux]    [k] __msecs_to_jiffies
     0.07%  cyclictest  [vdso]              [.] 0x0000000000000987
     0.07%  cyclictest  [kernel.vmlinux]    [k] update_rt_migration
     0.06%  cyclictest  [kernel.vmlinux]    [k] reweight_entity
     0.06%  cyclictest  [kernel.vmlinux]    [k] dequeue_task_fair
     0.06%  cyclictest  [kernel.vmlinux]    [k] update_load_avg
     0.06%  cyclictest  [kernel.vmlinux]    [k] trace_buffer_unlock_commit_regs
     0.06%  cyclictest  [kernel.vmlinux]    [k] trace_event_raw_event_sched_stat
     0.05%  cyclictest  [kernel.vmlinux]    [k] update_curr
     0.05%  cyclictest  cyclictest          [.] clock_gettime@plt
     0.04%  cyclictest  [vdso]              [.] 0x000000000000097d
     0.04%  cyclictest  [kernel.vmlinux]    [k] rcu_check_callbacks
     0.04%  cyclictest  [kernel.vmlinux]    [k] load_balance
     0.03%  cyclictest  [kernel.vmlinux]    [k] perf_pmu_disable.part.98
     0.03%  cyclictest  [kernel.vmlinux]    [k] __update_load_avg_se.isra.41
     0.03%  cyclictest  [kernel.vmlinux]    [k] __calc_delta
     0.03%  cyclictest  [kernel.vmlinux]    [k] futex_wake
     0.03%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c0d
     0.03%  cyclictest  cyclictest          [.] nanosleep@plt
     0.02%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c09
     0.02%  cyclictest  [kernel.vmlinux]    [k] set_next_entity
     0.02%  cyclictest  [kernel.vmlinux]    [k] switch_mm_irqs_off
     0.02%  cyclictest  [kernel.vmlinux]    [k] balance_runtime
     0.02%  cyclictest  [kernel.vmlinux]    [k] put_prev_task_fair
     0.02%  cyclictest  [kernel.vmlinux]    [k] find_task_by_vpid
     0.02%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c12
     0.02%  cyclictest  [kernel.vmlinux]    [k] kfree
     0.02%  cyclictest  [kernel.vmlinux]    [k] put_prev_entity
     0.01%  cyclictest  [kernel.vmlinux]    [k] rb_next
     0.01%  cyclictest  [kernel.vmlinux]    [k] ring_buffer_time_stamp_abs
     0.01%  cyclictest  [kernel.vmlinux]    [k] update_min_vruntime
     0.01%  cyclictest  [kernel.vmlinux]    [k] enqueue_top_rt_rq
     0.01%  cyclictest  [kernel.vmlinux]    [k] __kmalloc_node
     0.01%  cyclictest  libpthread-2.23.so  [.] 0x0000000000010c30
     0.01%  cyclictest  [kernel.vmlinux]    [k] ktime_get_update_offsets_now
     0.01%  cyclictest  [kernel.vmlinux]    [k] check_cfs_rq_runtime
     0.01%  cyclictest  [kernel.vmlinux]    [k] cpu_load_update
     0.01%  cyclictest  [kernel.vmlinux]    [k] flush_signal_handlers
     0.01%  cyclictest  [kernel.vmlinux]    [k] page_counter_uncharge
     0.00%  cyclictest  [kernel.vmlinux]    [k] end_repeat_nmi
     0.00%  cyclictest  [kernel.vmlinux]    [k] nmi_cpu_backtrace_handler
     0.00%  cyclictest  [kernel.vmlinux]    [k] nmi_restore
     0.00%  cyclictest  [kernel.vmlinux]    [k] nmi_cpu_backtrace
     0.00%  cyclictest  [kernel.vmlinux]    [k] nmi_handle
     0.00%  perf        [kernel.vmlinux]    [k] native_sched_clock
     0.00%  cyclictest  [kernel.vmlinux]    [k] intel_pmu_handle_irq
     0.00%  cyclictest  [kernel.vmlinux]    [k] perf_event_nmi_handler
     0.00%  cyclictest  [kernel.vmlinux]    [k] native_apic_msr_write
     0.00%  perf        [kernel.vmlinux]    [k] native_write_msr

 
> One thing that I notice too is that we have trace_save_cmdline that
> also gets activated when enabling trace events. Perhaps I should try to
> optimize that more.
>>
>> I have been running 20 times the following test, and thrown away the first results:
>>
>> $ sudo ./cyclictest --smp -p95 -m -s -N -l 100000 -q
>>
>> $ uname -a
>> Linux claudio-HP-ProBook-470-G5 4.18.0-rc3+ #3 SMP Tue Jul 3 15:50:30 CEST 2018 x86_64 x86_64 x86_64 GNU/Linux
>>
>> For brevity, this is a comparison of one test's results. All other test results show the same ~25% increase.
>>
>> On the left side, the run without ftrace sched events, on the right side with ftrace sched events enabled.
>>
>> CPU    Count      Min       Act        Avg        Max        Count      Min-ftrace Act-ftrace Avg-ftrace Max-ftrace
>> 0     100000      2339       2936       2841     139478     100000       2900       3182       3566      93056
>> 1      66742      2365       3386       2874      93639      66750       2959       3786       3646     154074
>> 2      50080      2376       3058       2910     196221      50097       2997       4209       3655      18707
>> 3      40076      2394       3461       2931      17914      40091       3006       4417       3750      17159
>> 4      33404      2371       3612       2834      15336      33419       2997       3836       3594      23172
>> 5      28635      2387       3313       2885      25863      28649       2995       3795       3647       9956
>> 6      25058      2384       3428       2968      12162      25071       3051       4366       3719      18151
>> 7      22275      2381       2859       2982      10706      22287       3046       5078       3825      10781
>>
>> I would be thankful for any advice or comments on this,
>> especially with the goal in mind to lower as much as possible the runtime impact on the system.
> 
> But still, we use all sorts of trace events when dealing with real
> time, and trace events hardly shows anything in the PREEMPT_RT kernel.
> Where cyclictest shows 20 microseconds, and sched events hardly touches
> that. But I see this weird side effect on a distro kernel too. Maybe
> there's something weird going on when we have CONFIG_VOLUNTARY_PREEMPT
> set :-/

If it helps, I tested also CONFIG_PREEMPT_NONE=y and CONFIG_PREEMPT=y,
with the same relative result (~25% increase).

> 
> I'll investigate further.

If stress-ng --switch-ops is of any help, I have some results for that as well.

This is with CONFIG_PREEMPT_NONE:

SCHED EVENTS DISABLED

$ stress-ng --times -s 8 --switch-ops 20000000

Test  Runtime(s) User(%) Sys(%) Total(%) bogo ops/s
   1       22.56    6.33  92.96    99.29     886525
   2       23.09    6.34  92.34    98.68     866176
   3       23.54    6.35  89.02    95.37     849618
   4       22.19    6.56  92.73    99.29     901306
   5       22.69    6.41  92.77    99.18     881445

SCHED EVENTS ENABLED

$ stress-ng --times -s 8 --switch-ops 20000000

Test  Runtime(s) User(%) Sys(%) Total(%) bogo ops/s
   1       36.28    4.12  95.31    99.43     551268
   2       37.61    4.07  95.18    99.25     531773
   3       37.96    4.09  94.94    99.03     526870
   4       38.17    4.01  94.97    98.99     523972
   5       37.47    4.11  94.92    99.03     533760

This is with CONFIG_PREEMPT:

SCHED EVENTS DISABLED

$ stress-ng --times -s 8 --switch-ops 20000000

Test  Runtime(s) User(%) Sys(%) Total(%) bogo ops/s
   1      25.73    5.65   93.28  98.93   777303
   2      25.74    5.68   93.14  98.82   777001
   3      24.84    5.88   92.95  98.83   805153
   4      25.12    5.69   93.06  98.75   796178
   5      26.04    5.58   92.79  98.37   768049

SCHED EVENTS ENABLED

Test  Runtime(s) User(%) Sys(%) Total(%) bogo ops/s
   1  	  41.70	   3.72	  95.90	 99.62	 479616
   2      42.91	   3.59	  95.52	 99.11 	 466092
   3  	  43.24	   3.48	  95.57	 99.05	 462535
   4      43.04	   3.46	  95.83	 99.30	 464684
   5  	  42.62	   3.60	  95.75	 99.34 	 469263

This is with CONFIG_VOLUNTARY_PREEMPT:

SCHED EVENTS DISABLED

$ stress-ng --times -s 8 --switch-ops 20000000

Test  Runtime(s) User(%) Sys(%) Total(%) bogo ops/s
   1      23.27    6.24  91.41   97.64   859476
   2      23.51    6.16  92.70   98.86   850702
   3      23.47    6.42  92.62   99.04   852152
   4      23.92    6.34  92.47   98.81   836120
   5      23.63    6.28  92.77   99.05   846382

SCHED EVENTS ENABLED

Test  Runtime(s) User(%) Sys(%) Total(%) bogo ops/s
   1   	  38.65	   3.82	 95.12	 98.94	 517464
   2  	  38.77	   4.27	 94.77	 99.03	 515863
   3  	  38.70	   3.96	 95.29	 99.25	 516796
   4  	  38.52	   4.06	 95.32	 99.38	 519211
   5  	  38.58	   4.24	 94.71	 98.95	 518403

> Thanks for the report.
> 
> -- Steve
> 

Thank you,

I will try to get more familiar with the internals
and then investigate as well.

Claudio

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ