Message-ID: <57A34448.1040600@kyup.com>
Date:	Thu, 4 Aug 2016 16:34:00 +0300
From:	Nikolay Borisov <kernel@...p.com>
To:	Erez Shitrit <erezsh@....mellanox.co.il>
Cc:	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
	netdev@...r.kernel.org
Subject: Slow veth performance over ipoib interface on 4.7.0 (and earlier)
 (Was Re: [IPOIB] Excessive TX packet drops due to IPOIB_MAX_PATH_REC_QUEUE)



On 08/01/2016 11:56 AM, Erez Shitrit wrote:
> The GID (9000:0:2800:0:bc00:7500:6e:d8a4) is not regular, not from
> local subnet prefix.
> why is that?
>

So I managed to debug this and it turns out the problem lies in the
interaction between veth and ipoib:

I've discovered the following strange thing. If I have a veth pair where
the two devices are in different network namespaces, as shown in the
attached scripts, then sending a file that originates from the veth
interface inside the non-init network namespace and goes out across the
ipoib interface is very slow (around 100kb). For simple reproduction I'm
attaching 2 scripts which have to be run on 2 machines, with the
respective IP addresses set on them. The sending node then initiates a
simple file copy over nc.
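For reference, here is a rough sketch of the kind of setup the attached
scripts implement; the interface names, addresses, port and nc invocation
below are placeholders, not the exact contents of sending-node.sh /
receive-node.sh:

# --- sending node (sketch) ---
ip netns add test-netnamespace
ip link add veth0 type veth peer name veth1
ip link set veth1 netns test-netnamespace

ip addr add 192.168.100.1/24 dev veth0
ip link set veth0 up

ip netns exec test-netnamespace ip addr add 192.168.100.2/24 dev veth1
ip netns exec test-netnamespace ip link set veth1 up
ip netns exec test-netnamespace ip route add default via 192.168.100.1

# forward the namespace's traffic out over ib0; the receiving node needs a
# route back to 192.168.100.0/24 via this host's ib0 address (or SNAT here)
sysctl -w net.ipv4.ip_forward=1

# copy a file from inside the namespace to the receiver's ib0 address
ip netns exec test-netnamespace sh -c 'nc <receiver-ib0-ip> 5001 < some-large-file'

# --- receiving node (sketch) ---
nc -l -p 5001 > /dev/null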

I've observed this behavior on upstream 4.4, 4.5.4 and 4.7.0 kernels,
with both IPv4 and IPv6 addresses. Here is what the debug log of the
ipoib module shows:

ib%d: max_srq_sge=128
ib%d: max_cm_mtu = 0xfff0, num_frags=16
ib0: enabling connected mode will cause multicast packet drops
ib0: mtu > 4092 will cause multicast packet drops.
ib0: bringing up interface
ib0: starting multicast thread
ib0: joining MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff
ib0: restarting multicast task
ib0: adding multicast entry for mgid ff12:601b:ffff:0000:0000:0000:0000:0001
ib0: restarting multicast task
ib0: adding multicast entry for mgid ff12:401b:ffff:0000:0000:0000:0000:0001
ib0: join completion for ff12:401b:ffff:0000:0000:0000:ffff:ffff (status 0)
ib0: Created ah ffff88081063ea80
ib0: MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff AV ffff88081063ea80, LID 0xc000, SL 0
ib0: joining MGID ff12:601b:ffff:0000:0000:0000:0000:0001
ib0: joining MGID ff12:401b:ffff:0000:0000:0000:0000:0001
ib0: successfully started all multicast joins
ib0: join completion for ff12:601b:ffff:0000:0000:0000:0000:0001 (status 0)
ib0: Created ah ffff880839084680
ib0: MGID ff12:601b:ffff:0000:0000:0000:0000:0001 AV ffff880839084680, LID 0xc002, SL 0
ib0: join completion for ff12:401b:ffff:0000:0000:0000:0000:0001 (status 0)
ib0: Created ah ffff88081063e280
ib0: MGID ff12:401b:ffff:0000:0000:0000:0000:0001 AV ffff88081063e280, LID 0xc004, SL 0

When the transfer is initiated I can see the following errors
on the sending node:

ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36

Here is the port GUID of the sending node: 0x0011750000772664, and of
the receiving one: 0x0011750000774d36.
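
Not part of the original report, but as a cross-check: the last 16 bytes
of the 20-byte IPoIB hardware address are the destination GID, so a rough
sketch like the one below can list which GIDs the neighbour table is
handing to ipoib and compare them with the real port GIDs. The HCA name
and port number (mlx4_0, port 1) are assumptions:

# decode the GID embedded in each IPoIB neighbour entry on ib0
ip -4 neigh show dev ib0 | awk '
{
    for (i = 1; i < NF; i++) {
        if ($i == "lladdr") {
            n = split($(i + 1), b, ":")
            if (n != 20)                  # not a 20-byte IPoIB hardware address
                next
            gid = ""
            for (j = 5; j <= 19; j += 2) {
                sep = (j > 5) ? ":" : ""
                gid = gid sep b[j] b[j + 1]
            }
            print $1, gid
        }
    }
}'

# local port GID (subnet prefix + port GUID) for comparison
cat /sys/class/infiniband/mlx4_0/ports/1/gids/0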

Here is how the paths look on the sending node; the incomplete entries
are clearly the paths being requested on behalf of the veth traffic:

cat /sys/kernel/debug/ipoib/ib0_path
GID: 401:0:1400:0:a0a8:ffff:1c01:4d36
complete: no

GID: 401:0:1400:0:a410:ffff:1c01:4d36
complete: no

GID: fe80:0:0:0:11:7500:77:2a1a
complete: yes
DLID: 0x0004
SL: 0
rate: 40.0 Gb/sec

GID: fe80:0:0:0:11:7500:77:4d36
complete: yes
DLID: 0x000a
SL: 0
rate: 40.0 Gb/sec

Testing the same scenario, but creating an ipoib child device in the
non-init network namespace via the following commands instead of using
veth devices, I can achieve sensible speeds:
ip link add link ib0 name ip1 type ipoib
ip link set dev ip1 netns test-netnamespace
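
(For completeness, a minimal sketch of the remaining steps, which are not
spelled out above; the address is a placeholder and the namespace is
assumed to have been created beforehand with "ip netns add test-netnamespace":)

ip netns exec test-netnamespace ip addr add 192.168.100.2/24 dev ip1
ip netns exec test-netnamespace ip link set ip1 up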

[Snipped a lot of useless stuff]

Download attachment "receive-node.sh" of type "application/x-shellscript" (181 bytes)

Download attachment "sending-node.sh" of type "application/x-shellscript" (806 bytes)
