Date:	Tue, 23 Apr 2013 16:19:23 +0200
From:	Jesper Dangaard Brouer <brouer@...hat.com>
To:	Hannes Frederic Sowa <hannes@...essinduktion.org>
Cc:	Eric Dumazet <eric.dumazet@...il.com>,
	"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [net-next PATCH 2/3] net: fix enforcing of fragment queue hash
 list depth

On Tue, 2013-04-23 at 02:20 +0200, Hannes Frederic Sowa wrote:
> On Mon, Apr 22, 2013 at 06:30:17PM +0200, Jesper Dangaard Brouer wrote:
> > On Mon, 2013-04-22 at 16:54 +0200, Hannes Frederic Sowa wrote:
> > > On Mon, Apr 22, 2013 at 11:10:34AM +0200, Jesper Dangaard Brouer wrote:
> > [...]
> > > > Besides, after we have implemented per hash bucket locking (in my change
> > > > commit 19952cc4 "net: frag queue per hash bucket locking").
> > > > Then, I don't think it is a big problem that a single hash bucket is
> > > > being "attacked".
> > > 
> > > I don't know, I wouldn't say so. The contention point is now the per
> > > hash bucket lock but it should show the same symptoms as before.
> > 
> > No, the contention point is the LRU list lock, not the hash bucket lock.
> > If you perf record/profile the code, you can easily miss that its the
> > LRU lock, because its inlined.  Try to rerun your tests with noinline
> > e.g.:
> 
> It depends on the test. Last time I checked with my ipv6 torture test I
> had most hits in the inet_frag_find loop (I looked at it after your per
> bucket locks landed in net-next). I think your test setup could provide
> more meaningful numbers. If you fill up the whole fragment cache it is
> plausible that the contention will shift towards the lru lock.

Yes, traffic patterns do affect the results, BUT you have to be really
careful profiling this:

Notice that inet_frag_find() also indirectly takes the LRU lock, and
the perf tool will blame inet_frag_find().  This is very subtle and
happens with a traffic pattern that wants to create new frag queues (i.e.
the queue is not found in the hash list).
The problem is that inet_frag_find() calls inet_frag_create() (if q is
not found), which calls inet_frag_intern(), which calls
inet_frag_lru_add(), taking the LRU lock.  All of these functions get
inlined by the compiler, thus inet_frag_find() gets the blame.
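
To make the call chain concrete, here is a minimal user-space sketch of it.
This is not the kernel code: every toy_* name is hypothetical, the LRU
spinlock is replaced by a plain counter, and `__attribute__((noinline))`
(a GCC/Clang attribute) stands in for the kernel's noinline annotation.
It only illustrates that a lookup miss reaches the LRU lock, while a hit
does not, and that keeping the helpers out of line gives a profiler
separate symbols to attribute samples to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_HASHSZ 64

/* Hypothetical stand-in for a fragment queue. */
struct toy_queue {
	unsigned int key;
	struct toy_queue *next;
};

/* Hypothetical stand-in for the per-protocol fragment state. */
struct toy_frags {
	struct toy_queue *hash[TOY_HASHSZ];
	unsigned long lru_lock_taken;	/* counts simulated LRU lock acquisitions */
};

/* Assumption: models inet_frag_lru_add(); the counter replaces the spinlock. */
static __attribute__((noinline)) void toy_lru_add(struct toy_frags *f,
						  struct toy_queue *q)
{
	(void)q;
	f->lru_lock_taken++;
}

/* Assumption: models inet_frag_intern(); links q into its bucket, then LRU. */
static __attribute__((noinline)) struct toy_queue *
toy_intern(struct toy_frags *f, struct toy_queue *q)
{
	unsigned int b;

	assert(q != NULL);
	b = q->key % TOY_HASHSZ;
	q->next = f->hash[b];
	f->hash[b] = q;
	toy_lru_add(f, q);
	return q;
}

/* Assumption: models inet_frag_create(); allocates and interns a new queue. */
static __attribute__((noinline)) struct toy_queue *
toy_create(struct toy_frags *f, unsigned int key)
{
	struct toy_queue *q = calloc(1, sizeof(*q));

	if (!q)
		return NULL;
	q->key = key;
	return toy_intern(f, q);
}

/* Assumption: models inet_frag_find(); only a miss falls through to create. */
struct toy_queue *toy_find(struct toy_frags *f, unsigned int key)
{
	struct toy_queue *q;

	for (q = f->hash[key % TOY_HASHSZ]; q; q = q->next)
		if (q->key == key)
			return q;	/* hit: the LRU lock is never touched */
	return toy_create(f, key);	/* miss: ends up in toy_lru_add() */
}
```

Calling toy_find() twice with the same key on a zero-initialised
struct toy_frags increments lru_lock_taken only on the first call.  If the
noinline attributes are dropped, the compiler is free to fold the three
helpers into toy_find(), and a profiler would then charge their cost to
toy_find() alone.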


To avoid pissing people off:
Yes, having a long list in the hash bucket obviously also contributes
significantly.  Yes, we should still increase the hash bucket size.  I'm
just pointing out that you need to be careful about what you actually
profile ;-)


Please see below a profile of current net-next, with "noinline" added
to inet_frag_intern, inet_frag_alloc and inet_frag_create.  Run under
test 20G3F+MQ.  I hope you can see my point about the LRU list lock;
please let me know if I have missed something.


--Jesper

Profile of net-next with below diff/patch, output from command:
  perf report -Mintel -C 1 --stdio --call-graph graph,2 

Cut down version (full version attached):
-----------------------------------------
# Overhead      Command      Shared Object                                   Symbol
# ........  ...........  .................  .......................................
#
    70.12%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_spin_lock                     
            |
            --- _raw_spin_lock
               |          
               |--19.70%-- ip_defrag
               |--17.36%-- inet_frag_kill
               |          |--15.04%-- inet_frag_evictor
               |           --2.32%-- ip_defrag
               |--17.15%-- inet_frag_intern
                --15.52%-- inet_frag_evictor
      
    9.61%      swapper  [kernel.kallsyms]  [k] _raw_spin_lock                     
                |
                --- _raw_spin_lock
                   |          
                   |--2.46%-- inet_frag_kill
                   |--2.44%-- ip_defrag
                   |--2.38%-- inet_frag_evictor
                    --2.28%-- inet_frag_intern 

     1.91%  ksoftirqd/1  [kernel.kallsyms]  [k] inet_frag_find                     

     1.66%  ksoftirqd/1  [kernel.kallsyms]  [k] inet_frag_intern                   

     1.54%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_defrag                          

     1.52%  ksoftirqd/1  [kernel.kallsyms]  [k] inet_frag_kill                     

     1.19%  ksoftirqd/1  [kernel.kallsyms]  [k] inet_frag_evictor                  

     1.05%  ksoftirqd/1  [kernel.kallsyms]  [k] __percpu_counter_add               

     0.77%  ksoftirqd/1  [kernel.kallsyms]  [k] __list_del_entry                   

     0.53%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_lock                     

     0.49%  ksoftirqd/1  [kernel.kallsyms]  [k] ip4_frag_match                     

     0.44%  ksoftirqd/1  [kernel.kallsyms]  [k] __list_add                         

     0.42%  ksoftirqd/1  [kernel.kallsyms]  [k] inet_getpeer                       


"noinline" changes to net-next:
-------------------------------
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index e97d66a..d899bba 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -241,7 +241,7 @@ int inet_frag_evictor(struct netns_frags *nf, struct inet_frags *f, bool force)
 }
 EXPORT_SYMBOL(inet_frag_evictor);
 
-static struct inet_frag_queue *inet_frag_intern(struct netns_frags *nf,
+static noinline struct inet_frag_queue *inet_frag_intern(struct netns_frags *nf,
                struct inet_frag_queue *qp_in, struct inet_frags *f,
                void *arg)
 {
@@ -289,7 +289,7 @@ static struct inet_frag_queue *inet_frag_intern(struct netns_frags *nf,
        return qp;
 }
 
-static struct inet_frag_queue *inet_frag_alloc(struct netns_frags *nf,
+static noinline struct inet_frag_queue *inet_frag_alloc(struct netns_frags *nf,
                struct inet_frags *f, void *arg)
 {
        struct inet_frag_queue *q;
@@ -309,7 +309,7 @@ static struct inet_frag_queue *inet_frag_alloc(struct netns_frags *nf,
        return q;
 }
 
-static struct inet_frag_queue *inet_frag_create(struct netns_frags *nf,
+static noinline struct inet_frag_queue *inet_frag_create(struct netns_frags *nf,
                struct inet_frags *f, void *arg)
 {
        struct inet_frag_queue *q;



View attachment "perf_noinline_call_graph.out" of type "text/plain" (42607 bytes)