Date: Fri, 11 Nov 2016 10:27:01 +0800
From: Jason Wang <jasowang@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] vhost_net: tx support batching

On 2016-11-10 04:05, Michael S. Tsirkin wrote:
> On Wed, Nov 09, 2016 at 03:38:33PM +0800, Jason Wang wrote:
>> This patch tries to utilize tuntap rx batching by peeking the tx
>> virtqueue during transmission: if there are more available buffers
>> in the virtqueue, the MSG_MORE flag is set as a hint for tuntap to
>> batch the packets. The maximum number of batched tx packets is
>> specified through a module parameter: tx_batched.
>>
>> When use 16 as tx_batched:
> When using
>
>> Pktgen test shows a 16% improvement on tx pps in the guest.
>> Netperf test does not show obvious regression.
> Why doesn't netperf benefit?

This is probably because the tests (4 VCPUs, 1 queue, TCP, mlx4) do not
put the vhost thread under 100% stress. In the pktgen test, 100% stress
on the vhost thread is achieved easily.

>
>> For safety, 1 were used as the default value for tx_batched.
> s/were used/is used/
>
>> Signed-off-by: Jason Wang <jasowang@...hat.com>
> These tests unfortunately only run a single flow.
> The concern would be whether this increases latency when
> NIC is busy with other flows, so I think this is what
> you need to test.

Multiple flows were tested too; no obvious improvement or regression
was found.

>
>
>> ---
>>  drivers/vhost/net.c   | 15 ++++++++++++++-
>>  drivers/vhost/vhost.c |  1 +
>>  drivers/vhost/vhost.h |  1 +
>>  3 files changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index 5dc128a..51c378e 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -35,6 +35,10 @@ module_param(experimental_zcopytx, int, 0444);
>>  MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
>>  		" 1 -Enable; 0 - Disable");
>>
>> +static int tx_batched = 1;
>> +module_param(tx_batched, int, 0444);
>> +MODULE_PARM_DESC(tx_batched, "Number of packets batched in TX");
>> +
>>  /* Max number of bytes transferred before requeueing the job.
>>   * Using this limit prevents one virtqueue from starving others. */
>>  #define VHOST_NET_WEIGHT 0x80000
> I think we should do some tests and find a good default.

Ok, will test 4 and 32 to see if there's any difference. (Btw, 16 was
chosen since dpdk tends to batch 16 packets during TX.)

Thanks
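A note for readers following the thread: the hunk quoted above only adds
the tx_batched module parameter; the code that actually sets MSG_MORE is
in a later part of the patch not shown here. As a rough sketch of the
idea described in the changelog (the counter name tx_packets and the
exact placement are illustrative assumptions, not the literal patch
code; vhost_vq_avail_empty() is an existing vhost helper for peeking at
the avail ring), the hint in vhost's handle_tx() loop could look like:

    /* Assumed sketch: after dequeueing one buffer for transmission,
     * peek at the tx virtqueue.  If more buffers are already queued
     * by the guest and we are still under the tx_batched budget, set
     * MSG_MORE so tuntap may hold the packet and batch it with the
     * following ones; otherwise clear the flag to flush the batch.
     */
    if (++tx_packets < tx_batched &&
        !vhost_vq_avail_empty(&net->dev, vq)) {
            msg.msg_flags |= MSG_MORE;
    } else {
            msg.msg_flags &= ~MSG_MORE;
            tx_packets = 0;
    }

Since the parameter is registered with mode 0444 it is read-only at
runtime, so the batch budget is fixed at module load time, e.g.
"modprobe vhost_net tx_batched=16".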