netdev - Re: Understanding lock contention in __udp4_lib_mcast


Message-ID: <20130627224444.GE5936@sbohrermbp13-local.rgmadvisors.com>
Date:	Thu, 27 Jun 2013 17:44:44 -0500
From:	Shawn Bohrer <sbohrer@...advisors.com>
To:	Rick Jones <rick.jones2@...com>
Cc:	netdev@...r.kernel.org
Subject: Re: Understanding lock contention in __udp4_lib_mcast_deliver

On Thu, Jun 27, 2013 at 03:03:15PM -0700, Rick Jones wrote:
> On 06/27/2013 02:54 PM, Shawn Bohrer wrote:
> >On Thu, Jun 27, 2013 at 01:46:58PM -0700, Rick Jones wrote:
> >>How do you know that time is actually contention and not simply
> >>acquire and release overhead?
> >
> >Excellent point, and that could be the problem with my thinking.  I
> >just now tried (unsuccessfully) to use lockstat to see if there was
> >any contention reported.  I read Documentation/lockstat.txt and
> >followed the instructions but the lock in question did not appear to
> >be in the output.  I think I'm going to have to go with the assumption
> >that this is just acquire and release overhead.
> 
> I think there is a way to get perf to "annotate" (iirc that is the
> term it uses) the report to show hits at the instruction level.
> Ostensibly one could then look and see how many of the hits were for
> the acquire/release part of the routine, and how much was for the
> actual contention.

Yep, so ~1% of my total time is in _raw_spin_lock, and from perf
annotate it appears that maybe only 5-6% of that is actually
contention; the rest is acquire/release.  Looks like I need to look
elsewhere for my performance improvements.  Thanks Rick for your help!
Below is the output of perf annotate if you're curious.

 Percent |      Source code & Disassembly of vmlinux
------------------------------------------------
         :
         :
         :
         :      Disassembly of section .text:
         :
         :      ffffffff814c72d0 <_raw_spin_lock>:
         :      EXPORT_SYMBOL(_raw_spin_trylock_bh);
         :      #endif
         :
         :      #ifndef CONFIG_INLINE_SPIN_LOCK
         :      void __lockfunc _raw_spin_lock(raw_spinlock_t *lock)
         :      {
    2.43 :      ffffffff814c72d0:       callq  ffffffff814cf440 <__fentry__>
    1.23 :      ffffffff814c72d5:       push   %rbp
    1.66 :      ffffffff814c72d6:       mov    %rsp,%rbp
         :       */
         :      static __always_inline void __ticket_spin_lock(arch_spinlock_t *lock)
         :      {
         :              register struct __raw_tickets inc = { .tail = 1 };
         :
         :              inc = xadd(&lock->tickets, inc);
    0.71 :      ffffffff814c72d9:       mov    $0x10000,%eax
    0.00 :      ffffffff814c72de:       lock xadd %eax,(%rdi)
   86.07 :      ffffffff814c72e2:       mov    %eax,%edx
    0.05 :      ffffffff814c72e4:       shr    $0x10,%edx
         :
         :              for (;;) {
         :                      if (inc.head == inc.tail)
    0.00 :      ffffffff814c72e7:       cmp    %ax,%dx
    0.00 :      ffffffff814c72ea:       je     ffffffff814c72fa <_raw_spin_lock+0x2a>
    0.04 :      ffffffff814c72ec:       nopl   0x0(%rax)
         :      }
         :
         :      /* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
         :      static inline void rep_nop(void)
         :      {
         :              asm volatile("rep; nop" ::: "memory");
    0.47 :      ffffffff814c72f0:       pause  
         :                              break;
         :                      cpu_relax();
         :                      inc.head = ACCESS_ONCE(lock->tickets.head);
    2.85 :      ffffffff814c72f2:       movzwl (%rdi),%eax
         :              register struct __raw_tickets inc = { .tail = 1 };
         :
         :              inc = xadd(&lock->tickets, inc);
         :
         :              for (;;) {
         :                      if (inc.head == inc.tail)
    3.53 :      ffffffff814c72f5:       cmp    %ax,%dx
    0.00 :      ffffffff814c72f8:       jne    ffffffff814c72f0 <_raw_spin_lock+0x20>
         :              __raw_spin_lock(lock);
         :      }
    0.91 :      ffffffff814c72fa:       pop    %rbp
    0.00 :      ffffffff814c72fb:       retq   
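
For anyone not familiar with the ticket lock in the listing above, here
is a minimal user-space sketch of the same idea in C11.  It is an
illustration only, not the kernel code: struct ticket_lock, the
function names and the two counters are hypothetical, and it keeps head
and tail in separate words instead of packing them into one word that a
single xadd updates.  The fetch-add plus the first comparison is the
acquire overhead; the loop body runs only when another holder is ahead,
which is the contention case.

```c
#include <stdatomic.h>

/* Hypothetical user-space ticket lock; not the kernel's arch_spinlock_t. */
struct ticket_lock {
    atomic_uint head;            /* ticket currently being served */
    atomic_uint tail;            /* next ticket to hand out */
    unsigned long acquisitions;  /* total acquires (written under the lock) */
    unsigned long contended;     /* acquires that had to spin at least once */
};

static void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->head, 0);
    atomic_init(&l->tail, 0);
    l->acquisitions = 0;
    l->contended = 0;
}

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Fast path: take a ticket.  This atomic read-modify-write is the
     * analogue of the "lock xadd" in the listing. */
    unsigned int me = atomic_fetch_add_explicit(&l->tail, 1,
                                                memory_order_relaxed);
    int waited = 0;

    /* Slow path: spin until our ticket is served.  The kernel also
     * issues cpu_relax() (PAUSE) here; omitted for portability. */
    while (atomic_load_explicit(&l->head, memory_order_acquire) != me)
        waited = 1;

    /* We hold the lock now, so plain updates to the counters are safe. */
    l->acquisitions++;
    l->contended += waited;
}

static void ticket_lock_release(struct ticket_lock *l)
{
    /* Only the holder writes head, so a relaxed load is enough here. */
    unsigned int next =
        atomic_load_explicit(&l->head, memory_order_relaxed) + 1;
    atomic_store_explicit(&l->head, next, memory_order_release);
}
```

With counters like these, contended / acquisitions gives the fraction
of acquires that actually waited, which is a different way to get at
the same question than reading per-instruction sample percentages.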

--
Shawn
