Message-ID: <CAKA=qzbZ4x5u1bBM0gAECeMNutyaVzp1FmRPwKHsBEEs7984ag@mail.gmail.com>
Date: Wed, 28 Dec 2011 17:23:07 -0600
From: Josh Hunt <joshhunt00@...il.com>
To: davem@...emloft.net, kuznet@....inr.ac.ru, jmorris@...ei.org,
yoshfuji@...ux-ipv6.org, kaber@...sh.net, netdev@...r.kernel.org
Cc: linux-kernel@...r.kernel.org
Subject: [PATCH RFC] IPv6: Avoid taking write lock for /proc/net/ipv6_route
During some debugging I needed to look into how /proc/net/ipv6_route
operates, and in my digging I found it's calling fib6_clean_all(), which
takes "write_lock_bh(&table->tb6_lock)" before walking the table. I saw
this on 2.6.32, but from reading the code I believe the same basic idea
exists in the current code. The rtnetlink code, by contrast, only takes
"read_lock_bh(&table->tb6_lock);" via fib6_dump_table(). While I realize
reading from proc probably isn't the recommended way of fetching the
ipv6 route table, taking a write lock seems unnecessary and would
probably cause network performance issues.
To verify this I loaded up the ipv6 route table and then ran iperf in 3 cases:

  * doing nothing
  * reading ipv6 route table via proc:
      while :; do cat /proc/net/ipv6_route > /dev/null; done
  * reading ipv6 route table via rtnetlink:
      while :; do ip -6 route show table all > /dev/null; done

Setup and results:

  * Load the ipv6 route table up with:
      for ((i = 0;i < 4000;i++)); do ip route add unreachable 2000::$i; done
  * iperf commands:
      client: iperf -i 1 -V -c <ipv6 addr>
      server: iperf -V -s
  * iperf results - 3 runs as client, 3 runs as server (in Mbits/sec)
      nothing: client: 927,927,927     server: 927,927,927
      proc:    client: 179,97,96,113   server: 142,112,133
      iproute: client: 928,927,928     server: 927,927,927
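
To put the results above in relative terms, here is a small Python sketch that averages each case and reports the throughput lost against the "nothing" baseline. It is only an illustration and not part of the original test setup; the names RESULTS and drop_vs_baseline are made up here, and the numbers are copied verbatim from the list above (including the four client values reported for the proc case).

```python
from statistics import mean

# Throughput in Mbits/sec, copied from the iperf results above.
RESULTS = {
    "nothing": {"client": [927, 927, 927], "server": [927, 927, 927]},
    "proc": {"client": [179, 97, 96, 113], "server": [142, 112, 133]},
    "iproute": {"client": [928, 927, 928], "server": [927, 927, 927]},
}


def drop_vs_baseline(case, role, baseline="nothing"):
    """Return the fractional throughput loss of `case` relative to `baseline`."""
    base = mean(RESULTS[baseline][role])
    return 1.0 - mean(RESULTS[case][role]) / base


if __name__ == "__main__":
    for case in ("proc", "iproute"):
        for role in ("client", "server"):
            print(f"{case:8s}{role:7s} {drop_vs_baseline(case, role):6.1%}")
```

By this measure the proc reader costs roughly 86-87% of the baseline throughput in both directions, while the rtnetlink reader is within noise of the baseline.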
lock_stat shows taking the write lock is causing the slowdown. Using
this info I decided to write a version of fib6_clean_all() which
replaces write_lock_bh(&table->tb6_lock) with
read_lock_bh(&table->tb6_lock). With this new function I see the same
results as with my rtnetlink iperf test. I guess my question is what
am I missing? Is there a reason you need to take the write lock when
reading the route table to display to proc?
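
The difference between the two lock modes can be illustrated with a toy userspace model. The sketch below is not kernel code and does not reproduce the kernel's rwlock implementation; ToyRWLock and lookup_admitted_during_walk are hypothetical names, and it assumes (this message does not state it) that the packet path takes tb6_lock for reading during route lookups. It only shows the general reader-writer rule: a second reader is admitted while another reader holds the lock, but not while a writer holds it.

```python
import threading


class ToyRWLock:
    """Toy reader-writer lock: many readers or exactly one writer.

    Non-blocking try_* methods keep the demonstration deterministic.
    """

    def __init__(self):
        self._mutex = threading.Lock()  # protects the two counters below
        self._readers = 0
        self._writer = False

    def try_read(self):
        with self._mutex:
            if self._writer:
                return False  # a writer excludes all readers
            self._readers += 1
            return True

    def try_write(self):
        with self._mutex:
            if self._writer or self._readers:
                return False  # a writer needs exclusive access
            self._writer = True
            return True

    def release_read(self):
        with self._mutex:
            self._readers -= 1

    def release_write(self):
        with self._mutex:
            self._writer = False


def lookup_admitted_during_walk(walk_takes_write_lock):
    """Return True if a (hypothetical) route lookup can take the read lock
    while a table walk is in progress."""
    lock = ToyRWLock()
    if walk_takes_write_lock:
        assert lock.try_write()  # walk in the style of the proc path
    else:
        assert lock.try_read()   # walk in the style of the rtnetlink path
    admitted = lock.try_read()   # concurrent lookup attempts a read lock
    if admitted:
        lock.release_read()
    if walk_takes_write_lock:
        lock.release_write()
    else:
        lock.release_read()
    return admitted


if __name__ == "__main__":
    print("walk holds write lock, lookup admitted:", lookup_admitted_during_walk(True))
    print("walk holds read lock, lookup admitted:", lookup_admitted_during_walk(False))
```

Under that assumption, every lookup would have to wait for a write-locked walk of the table to finish, which would be consistent with the slowdown measured for the proc case.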
I attached a patch with my crude method listed above.
Thanks
Josh
View attachment "ipv6-avoid-taking-write-lock-for-proc-net-ipv6_route.patch" of type "text/x-diff" (3796 bytes)