Date:	Wed, 9 Mar 2016 19:39:19 +0300
From:	Cyrill Gorcunov <gorcunov@...il.com>
To:	Eric Dumazet <eric.dumazet@...il.com>,
	David Miller <davem@...emloft.net>
Cc:	netdev@...r.kernel.org, solar@...nwall.com, vvs@...tuozzo.com,
	avagin@...tuozzo.com, xemul@...tuozzo.com, vdavydov@...tuozzo.com,
	khorenko@...tuozzo.com
Subject: Re: [RFC] net: ipv4 -- Introduce ifa limit per net

On Sun, Mar 06, 2016 at 08:06:41PM +0300, Cyrill Gorcunov wrote:
> > 
> > Well, this looks like LOCKDEP kernel. Are you really running LOCKDEP on
> > production kernels ?
> 

Hi Eric, David. Sorry for the delay. I've finally measured the
latency on real hardware: an i7-2600 CPU with 16G of memory. Here
is the collected data.

---
Unpatched vanilla
=================

commit 7f02bf6b5f5de90b7a331759b5364e41c0f39bf9
Author: Linus Torvalds <torvalds@...ux-foundation.org>
Date:   Tue Mar 8 09:41:20 2016 -0800

 Creating new addresses
 ----------------------
  19.26%  [kernel]                      [k] check_lifetime
  13.88%  [kernel]                      [k] __inet_insert_ifa
  13.01%  [kernel]                      [k] inet_rtm_newaddr

 Release
 -------
  20.96%  [kernel]                    [k] _raw_spin_lock
  17.79%  [kernel]                    [k] preempt_count_add
  14.79%  [kernel]                    [k] __local_bh_enable_ip
  13.08%  [kernel]                    [k] preempt_count_sub
   9.21%  [kernel]                    [k] nf_ct_iterate_cleanup
   3.15%  [kernel]                    [k] _raw_spin_unlock
   2.80%  [kernel]                    [k] nf_conntrack_lock
   2.67%  [kernel]                    [k] in_lock_functions
   2.63%  [kernel]                    [k] get_parent_ip
   2.26%  [kernel]                    [k] __inet_del_ifa
   2.17%  [kernel]                    [k] fib_del_ifaddr
   1.77%  [kernel]                    [k] _cond_resched

[root@...5 ~]# ./exploit.sh
START 4		addresses STOP 1457537580 1457537581
START 2704	addresses STOP 1457537584 1457537589
START 10404	addresses STOP 1457537602 1457537622
START 23104	addresses STOP 1457537657 1457537702
START 40804	addresses STOP 1457537784 1457537867
START 63504	addresses STOP 1457538048 1457538187

Patched (David's two patches)
=============================

 Creating new addresses
 ----------------------
  21.63%  [kernel]                    [k] check_lifetime
  14.31%  [kernel]                    [k] __inet_insert_ifa
  13.47%  [kernel]                    [k] inet_rtm_newaddr
   1.53%  [kernel]                    [k] check_preemption_disabled
   1.38%  [kernel]                    [k] page_fault
   1.27%  [kernel]                    [k] unmap_page_range

 Release
 -------
  24.26%  [kernel]                    [k] _raw_spin_lock
  17.55%  [kernel]                    [k] preempt_count_add
  14.81%  [kernel]                    [k] __local_bh_enable_ip
  14.17%  [kernel]                    [k] preempt_count_sub
  10.10%  [kernel]                    [k] nf_ct_iterate_cleanup
   3.00%  [kernel]                    [k] _raw_spin_unlock
   2.95%  [kernel]                    [k] nf_conntrack_lock
   2.86%  [kernel]                    [k] in_lock_functions
   2.73%  [kernel]                    [k] get_parent_ip
   1.91%  [kernel]                    [k] _cond_resched
   0.39%  [kernel]                    [k] task_tick_fair
   0.27%  [kernel]                    [k] native_write_msr_safe
   0.22%  [kernel]                    [k] rcu_check_callbacks
   0.20%  [kernel]                    [k] check_lifetime
   0.18%  [kernel]                    [k] check_preemption_disabled
   0.16%  [kernel]                    [k] hrtimer_active
   0.13%  [kernel]                    [k] __inet_insert_ifa
   0.13%  [kernel]                    [k] __memmove
   0.13%  [kernel]                    [k] inet_rtm_newaddr

[root@...5 ~]# ./exploit.sh
START 4		addresses STOP 1457539863 1457539864
START 2704	addresses STOP 1457539867 1457539872
START 10404	addresses STOP 1457539885 1457539905
START 23104	addresses STOP 1457539938 1457539980
START 40804	addresses STOP 1457540058 1457540132
START 63504	addresses STOP 1457540305 1457540418
---

Lockdep is turned off. The script itself is:
---
[root@...5 ~]# cat ./exploit.sh 
#!/bin/sh

if [ -z "$1" ]; then
	for x in `seq 1 50 255`; do
		echo -n "START "
		(unshare -n /bin/sh exploit.sh $x)
		sleep 1
		for j in `seq 0 100`; do
			ip r > /dev/null
		done
		echo -n " "
		echo `date +%s`
	done
else
	for x in `seq 0 $1`; do
		for y in `seq 0 $1`; do
			ip a a 127.1.$x.$y dev lo
		done
	done
	num=`ip a l dev lo | grep -c "inet "`
	echo -n "$num addresses "
	echo -n "STOP "
	echo -n `date +%s`
	exit
fi
---
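
As a cross-check of the address counts in the runs above: the outer
loop passes x = 1, 51, ..., 251, and the two inner loops each run over
0..x, so every namespace should end up with (x + 1)^2 addresses on lo.
A small sketch that prints the expected counts (the function name
expected_counts is only for illustration, it is not part of the script):

```sh
#!/bin/sh

# Print "x count" for every x the outer loop of exploit.sh uses.
# Each inner loop iterates over 0..x, i.e. x + 1 values, so the
# nested loops add (x + 1) * (x + 1) addresses.
expected_counts() {
	for x in $(seq 1 50 255); do
		n=$((x + 1))
		echo "$x $((n * n))"
	done
}

expected_counts
```

The second column should line up with the counts printed in the runs
(4, 2704, 10404, 23104, 40804, 63504).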

Note that I run ip r in a loop, with a sleep before it. On an idle
machine this loop takes ~1 second, but when it runs while the kernel
is cleaning up the net namespace it takes much longer.
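
To turn the START/STOP lines into per-run delays, something like the
following should do. It is only a sketch: it assumes the output format
shown above (count in field 2, the two timestamps in fields 5 and 6),
and the function name delays is made up for illustration. The first
timestamp is taken inside the namespace right after the addresses are
added, the second after the sleep and the ip r loop, so the difference
includes the ~1 second the loop takes on an idle machine.

```sh
#!/bin/sh

# Read exploit.sh output on stdin and print
# "<number of addresses> <seconds between the two timestamps>".
# Lines that do not look like "START n addresses STOP t1 t2" are skipped.
delays() {
	awk '$1 == "START" && $4 == "STOP" && NF >= 6 { print $2, $6 - $5 }'
}
```

It can be fed from a saved log, or directly as ./exploit.sh | delays.
For the 63504 address run this gives 139 seconds unpatched versus 113
seconds patched.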

Also, here is a graph of the collected data (blue line: unpatched
version, red: patched). With the patched version it gets noticeably
better, but it still hangs.

https://docs.google.com/spreadsheets/d/1eyQDxjuZY2DHKYksGACpHDDcV1Bd92e-ZiY8ywPKshA/edit?usp=sharing

The perf output above shows "perf top" while the addresses
are being created and while they are being released.

I think the main problem is still that we allow requesting
as many inet addresses as free memory permits. Of course the
kernel can't handle all of them in O(1) time: all resources
must be released, so there will always be some lag. Thus
maybe introducing limits would be a good idea for sysadmins.

	Cyrill
