netdev - [PATCH RFC] IPv6: Avoid taking write lock for /proc/net/ipv6

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAKA=qzbZ4x5u1bBM0gAECeMNutyaVzp1FmRPwKHsBEEs7984ag@mail.gmail.com>
Date:	Wed, 28 Dec 2011 17:23:07 -0600
From:	Josh Hunt <joshhunt00@...il.com>
To:	davem@...emloft.net, kuznet@....inr.ac.ru, jmorris@...ei.org,
	yoshfuji@...ux-ipv6.org, kaber@...sh.net, netdev@...r.kernel.org
Cc:	linux-kernel@...r.kernel.org
Subject: [PATCH RFC] IPv6: Avoid taking write lock for /proc/net/ipv6_route

During some debugging I needed to look into how /proc/net/ipv6_route
operated and in my digging I found its calling fib6_clean_all() which
uses "write_lock_bh(&table->tb6_lock)" before doing the walk of the
table. I saw this on 2.6.32, but reading the code I believe the same
basic idea exists in the current code. Looking at the rtnetlink code
they are only calling "read_lock_bh(&table->tb6_lock);" via
fib6_dump_table(). While I realize reading from proc probably isn't
the recommended way of fetching the ipv6 route table; taking a write
lock seems unnecessary and would probably cause network performance
issues.

To verify this I loaded up the ipv6 route table and then ran iperf in 3 cases:
  * doing nothing
  * reading ipv6 route table via proc (while :; do cat
/proc/net/ipv6_route > /dev/null; done)
  * reading ipv6 route table via rtnetlink - (while :; do ip -6 route
show table all > /dev/null; done)

* Load the ipv6 route table up with:
  * for ((i = 0;i < 4000;i++)); do ip route add unreachable 2000::$i; done

* iperf commands:
  * client: iperf -i 1 -V -c <ipv6 addr>
  * server: iperf -V -s

* iperf results - 3 runs as client, 3 runs as server (in Mbits/sec)
  * nothing: client: 927,927,927 server: 927,927,927
  * proc: client: 179,97,96,113 server: 142,112,133
  * iproute: client: 928,927,928 server: 927,927,927

lock_stat shows taking the write lock is causing the slowdown. Using
this info I decided to write a version of fib6_clean_all() which
replaces write_lock_bh(&table->tb6_lock) with
read_lock_bh(&table->tb6_lock). With this new function I see the same
results as with my rtnetlink iperf test. I guess my question is what
am I missing? Is there a reason you need to take the write lock when
reading the route table to display to proc?

I attached a patch with my crude method listed above.

Thanks
Josh

View attachment "ipv6-avoid-taking-write-lock-for-proc-net-ipv6_route.patch" of type "text/x-diff" (3796 bytes)