Message-ID: <1412250327.16704.84.camel@edumazet-glaptop2.roam.corp.google.com>
Date: Thu, 02 Oct 2014 04:45:27 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Amir Vadai <amirv@...lanox.com>
Cc: Or Gerlitz <gerlitz.or@...il.com>,
Alexei Starovoitov <ast@...mgrid.com>,
"David S. Miller" <davem@...emloft.net>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Eric Dumazet <edumazet@...gle.com>,
John Fastabend <john.r.fastabend@...el.com>,
Linux Netdev List <netdev@...r.kernel.org>,
Or Gerlitz <or.gerlitz@...il.com>, amira@...lanox.com,
idos@...lanox.com, Yevgeny Petrilin <yevgenyp@...lanox.com>,
eyalpe@...lanox.com
Subject: Re: [PATCH v2 net-next] mlx4: optimize xmit path
On Thu, 2014-10-02 at 11:03 +0300, Amir Vadai wrote:
> Hi,
>
> Will take it into the split patchset - we just hit this bug when tried
> to run benchmarks with blueflame disabled (easy to test by using ethtool
> priv flag blueflame).
Hmm, I do not know this ethtool command, please share ;)
>
> I'm still working on it, but I can't reproduce the numbers that you
> show. On my development machine, I get ~5.5Mpps with burst=8 and ~2Mpps
> with burst=1.
You have to be careful with the 'clone X' value: if you choose too large
a value, TX completion competes with the sender thread.
>
> In addition, I see no improvements when adding the optimization to the
> xmit path.
> I use the net-next kernel + pktgen burst support patch, with and without
> this xmit path optimization patch.
>
> Do you use other patches not upstream in your environment?
Nope, this is with David's net-next tree.
> Can you share the .config/pktgen configuration?
Sure.
>
> One other note: we're checking now that blueflame could be used with
> xmit_more. It might result with packets reordering/drops. Still under
> investigation.
I noticed no reorders. I tweaked the stack to force GSO segmentation
(in software) instead of using NIC TSO for small packets (2 or 3 MSS).
With 200 concurrent netperf -t TCP_RR -- -r 2000,2000, performance
increased by ~100%.
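
The concurrent TCP_RR run described above could be launched roughly as follows. This is only a sketch: the run_tcp_rr helper, the NETPERF override variable, the target address and the test duration are assumptions made for illustration. Only the -t TCP_RR -- -r 2000,2000 arguments and the count of 200 come from the mail.

```shell
#!/bin/bash
# Hypothetical helper (not from the original mail): start N concurrent
# netperf TCP_RR flows against one host and wait for all of them.
#   $1 = netperf server address (assumed to be running netserver)
#   $2 = number of concurrent flows (default 200, as in the mail)
#   $3 = test length in seconds (default 10, an arbitrary choice)
# NETPERF can be overridden, e.g. NETPERF=echo for a dry run.
run_tcp_rr() {
    local host=$1 flows=${2:-200} secs=${3:-10} i
    for ((i = 0; i < flows; i++)); do
        "${NETPERF:-netperf}" -H "$host" -l "$secs" -t TCP_RR -- -r 2000,2000 &
    done
    wait
}
```

A dry run such as NETPERF=echo run_tcp_rr 10.246.7.152 200 prints the 200 command lines without generating traffic, which is a cheap way to check the arguments before a real run.
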
#!/bin/bash
#
# on the destination, drop packets with
# iptables -A PREROUTING -t raw -p udp --dport 9 -j DROP
# Or run a recent enough kernel with global ICMP rate limiting to 1000 packets/sec
# ( http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=4cdf507d54525842dfd9f6313fdafba039084046 )
#
#### Configure
# Yeah, if you use PKTSIZE <= 104, performance is lower because of inline (copy whole frame content into tx desc)
PKTSIZE=105
# PKTRATE is not set anywhere in this script
echo "pktrate: ${PKTRATE:-unset}"
COUNT=20000000000
RUN_SECS=60
SRC_DEV=eth0
SRC_IP_MIN=7.0.0.1
SRC_IP_MAX=7.255.255.255
SRC_MAC=00:1a:11:c3:0d:7f
DST_IP=10.246.7.152
DST_MAC=00:1a:11:c3:0d:45
DST_UDP=9
## END OF CONFIGURATION OPTIONS
#### Helper
## Configuration procfs inodes
DEV_INODE=/proc/net/pktgen/$SRC_DEV
MAIN_INODE=/proc/net/pktgen/pgctrl
THREAD_INODE=/proc/net/pktgen/kpktgend_2
# write to a procfs file
function pgset_ex()
{
    local result
    echo "$2"
    echo "$2" > "$1"
    result=$(grep -F "Result: OK:" "$1")
    if [ "$result" = "" ]; then
        grep -F "Result:" "$1"
    fi
}
#### Pre: configure
# load pktgen first so that the /proc/net/pktgen entries exist
modprobe pktgen
# attach device exclusively
pgset_ex $THREAD_INODE "rem_device_all"
pgset_ex $THREAD_INODE "add_device $SRC_DEV"
# configure basics
pgset_ex $DEV_INODE "clone_skb 8"
pgset_ex $DEV_INODE "src_min $SRC_IP_MIN"
pgset_ex $DEV_INODE "src_max $SRC_IP_MAX"
pgset_ex $DEV_INODE "dst $DST_IP"
pgset_ex $DEV_INODE "dst_mac $DST_MAC"
pgset_ex $DEV_INODE "udp_dst_min $DST_UDP"
pgset_ex $DEV_INODE "udp_dst_max $DST_UDP"
pgset_ex $DEV_INODE "queue_map_min 0"
pgset_ex $DEV_INODE "queue_map_max 0"
pgset_ex $DEV_INODE "burst 8"
pgset_ex $DEV_INODE "pkt_size $PKTSIZE"
pgset_ex $DEV_INODE "delay 0"
# packet count (count 0 would mean continuous transmission)
pgset_ex $DEV_INODE "count $COUNT"
#### Run: transmit
echo -e "UDP packet generator (based on linux pktgen)\n"
echo -e " src: mac=$SRC_MAC ip=$SRC_IP_MIN-$SRC_IP_MAX dev=$SRC_DEV"
echo -e " dest: mac=$DST_MAC ip=$DST_IP port=$DST_UDP\n"
modprobe pktgen
#ethtool -C eth0 tx-usecs 16 tx-frames 16
#ethtool -C eth1 tx-usecs 16 tx-frames 16
# start thread(s)
# the write will block until Ctrl^C is pressed or a timeout kills the write
echo "Running for $RUN_SECS seconds"
#pgset_ex $MAIN_INODE "start"
echo "start" > $MAIN_INODE 2>/dev/null &
sleep $RUN_SECS
echo $DEV_INODE
cat $DEV_INODE
# stop
kill $!
pgset_ex $MAIN_INODE "stop"
echo "OK. All done"
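
To compare runs in the Mpps terms used in this thread, the rate can be pulled out of the device file that the script prints at the end. The sketch below assumes the pktgen result block contains a line of the form "<N>pps <M>Mb/sec ..."; the helper names pktgen_pps and pktgen_mpps are made up for this example.

```shell
#!/bin/bash
# Hypothetical helpers (not part of the script above).
# Read pktgen device output on stdin and print the packets-per-second
# figure, assuming a result line that starts with "<digits>pps".
pktgen_pps() {
    sed -n 's/^[[:space:]]*\([0-9][0-9]*\)pps.*/\1/p'
}

# Same input, but print the rate in millions of packets per second.
pktgen_mpps() {
    pktgen_pps | LC_ALL=C awk '{ printf "%.2f\n", $1 / 1e6 }'
}
```

For example, pktgen_mpps < /proc/net/pktgen/eth0 after a run prints a single number that can be compared across burst and clone_skb settings.
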