lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9D2C22909C6E774EBFB8B5583AE5291C019F9B16@fmsmsx414.amr.corp.intel.com>
Date:	Fri, 29 Dec 2006 15:52:26 -0800
From:	"Chen, Tim C" <tim.c.chen@...el.com>
To:	"Ingo Molnar" <mingo@...e.hu>
Cc:	<linux-kernel@...r.kernel.org>,
	"Siddha, Suresh B" <suresh.b.siddha@...el.com>,
	"Peter Zijlstra" <a.p.zijlstra@...llo.nl>
Subject: RE: 2.6.19-rt14 slowdown compared to 2.6.19

Ingo Molnar wrote:
> 
> If you'd like to profile this yourself then the lowest-cost way of
> profiling lock contention on -rt is to use the yum kernel and run the
> attached trace-it-lock-prof.c code on the box while your workload is
> in 'steady state' (and is showing those extended idle times):
> 
>   ./trace-it-lock-prof > trace.txt
> 
> this captures up to 1 second worth of system activity, on the current
> CPU. Then you can construct the histogram via:
> 
>   grep -A 1 ' __schedule()<-' trace.txt | cut -d: -f2- | sort |
>                                   uniq -c | sort -n > prof.txt
> 

I did lock profiling on Volanomark as suggested and obtained the 
profile that is listed below. 

    246
__sched_text_start()<-schedule()<-rt_spin_lock_slowlock()<-__lock_text_s
tart()
    264  rt_mutex_slowunlock()<-rt_mutex_unlock()<-rt_up_read()<-(-1)()
    334
__sched_text_start()<-schedule()<-posix_cpu_timers_thread()<-kthread()
    437  __sched_text_start()<-schedule()<-do_futex()<-sys_futex()
    467  (-1)()<-(0)()<-(0)()<-(0)()
    495
__sched_text_start()<-preempt_schedule()<-__spin_unlock_irqrestore()<-rt
_mutex_adjust_prio()
    497  __netif_rx_schedule()<-netif_rx()<-loopback_xmit()<-(-1)()
    499
__sched_text_start()<-schedule()<-schedule_timeout()<-sk_wait_data()
    500  tcp_recvmsg()<-sock_common_recvmsg()<-sock_recvmsg()<-(-1)()
    503  __rt_down_read()<-rt_down_read()<-do_futex()<-(-1)()
   1160  __sched_text_start()<-schedule()<-ksoftirqd()<-kthread()
   1433  __rt_down_read()<-rt_down_read()<-futex_wake()<-(-1)()
   1497  child_rip()<-(-1)()<-(0)()<-(0)()
   1936
__sched_text_start()<-schedule()<-rt_mutex_slowlock()<-rt_mutex_lock()

Looks like the idle time I saw was due to lock contention 
during call to futex_wake, which requires acquisition of
current->mm->mmap_sem. 
Many of the java threads share mm and result in concurrent access to
common mm.  
Looks like under rt case there is no special treatment to read locking
so
the read lock accesses are contended under __rt_down_read.  For non rt
case, 
__down_read makes the distinction for read lock access and the read
lockings 
do not contend. 

Things are made worse here as this delayed waking up processes locked by
the futex.
See also a snippet of the latency_trace below. 

  <idle>-0     2D..2 5821us!: thread_return <softirq--31> (150 20)
  <idle>-0     2DN.1 6278us :
__sched_text_start()<-cpu_idle()<-start_secondary()<-(-1)()
  <idle>-0     2DN.1 6278us : (0)()<-(0)()<-(0)()<-(0)()
    java-6648  2D..2 6280us+: thread_return <<idle>-0> (20 -4)
    java-6648  2D..1 6296us :
try_to_wake_up()<-wake_up_process()<-wakeup_next_waiter()<-rt_mutex_slow
unlock()
    java-6648  2D..1 6296us :
rt_mutex_unlock()<-rt_up_read()<-do_futex()<-(-1)()
    java-6648  2D..2 6297us : effective_prio <<...>-6673> (-4 -4)
    java-6648  2D..2 6297us : __activate_task <<...>-6673> (-4 1)
    java-6648  2.... 6297us < (-11)
    java-6648  2.... 6298us+> sys_futex (0000000000afaf50
0000000000000001 0000000000000001)
    java-6648  2...1 6315us :
__sched_text_start()<-schedule()<-rt_mutex_slowlock()<-rt_mutex_lock()
    java-6648  2...1 6315us :
__rt_down_read()<-rt_down_read()<-futex_wake()<-(-1)()
    java-6648  2D..2 6316us+: deactivate_task <java-6648> (-4 1)
  <idle>-0     2D..2 6318us+: thread_return <java-6648> (-4 20)
  <idle>-0     2DN.1 6327us :
__sched_text_start()<-cpu_idle()<-start_secondary()<-(-1)()
  <idle>-0     2DN.1 6328us+: (0)()<-(0)()<-(0)()<-(0)()
    java-6629  2D..2 6330us+: thread_return <<idle>-0> (20 -4)
    java-6629  2D..1 6347us :
try_to_wake_up()<-wake_up_process()<-wakeup_next_waiter()<-rt_mutex_slow
unlock()
    java-6629  2D..1 6347us :
rt_mutex_unlock()<-rt_up_read()<-futex_wake()<-(-1)()
    java-6629  2D..2 6348us : effective_prio <java-6235> (-4 -4)
    java-6629  2D..2 6349us : __activate_task <java-6235> (-4 1)
    java-6629  2.... 6350us+< (0)
    java-6629  2.... 6352us+> sys_futex (0000000000afc1dc
0000000000000001 0000000000000001)
    java-6629  2...1 6368us :
__sched_text_start()<-schedule()<-rt_mutex_slowlock()<-rt_mutex_lock()
    java-6629  2...1 6368us :
__rt_down_read()<-rt_down_read()<-futex_wake()<-(-1)()
    java-6629  2D..2 6369us+: deactivate_task <java-6629> (-4 1)
  <idle>-0     2D..2 6404us!: thread_return <java-6629> (-4 20)
  <idle>-0     2DN.1 6584us :
__sched_text_start()<-cpu_idle()<-start_secondary()<-(-1)()

Thanks.

Tim
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ