Message-ID: <20110201055627.GG9124@redhat.com>
Date: Tue, 1 Feb 2011 07:56:27 +0200
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Sridhar Samudrala <sri@...ibm.com>
Cc: Steve Dobbelstein <steved@...ibm.com>,
David Miller <davem@...emloft.net>, kvm@...r.kernel.org,
mashirle@...ux.vnet.ibm.com, netdev@...r.kernel.org
Subject: Re: Network performance with small packets
On Mon, Jan 31, 2011 at 05:30:38PM -0800, Sridhar Samudrala wrote:
> On Mon, 2011-01-31 at 18:24 -0600, Steve Dobbelstein wrote:
> > "Michael S. Tsirkin" <mst@...hat.com> wrote on 01/28/2011 06:16:16 AM:
> >
> > > OK, so thinking about it more, maybe the issue is this:
> > > tx becomes full. We process one request and interrupt the guest,
> > > then it adds one request and the queue is full again.
> > >
> > > Maybe the following will help it stabilize?
> > > By itself it does nothing, but if you set
> > > all the parameters to a huge value we will
> > > only interrupt when we see an empty ring.
> > > Which might be too much: pls try other values
> > > in the middle: e.g. make bufs half the ring,
> > > or bytes some small value, or packets some
> > > small value etc.
> > >
> > > Warning: completely untested.
> > >
> > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > > index aac05bc..6769cdc 100644
> > > --- a/drivers/vhost/net.c
> > > +++ b/drivers/vhost/net.c
> > > @@ -32,6 +32,13 @@
> > > * Using this limit prevents one virtqueue from starving others. */
> > > #define VHOST_NET_WEIGHT 0x80000
> > >
> > > +int tx_bytes_coalesce = 0;
> > > +module_param(tx_bytes_coalesce, int, 0644);
> > > +int tx_bufs_coalesce = 0;
> > > +module_param(tx_bufs_coalesce, int, 0644);
> > > +int tx_packets_coalesce = 0;
> > > +module_param(tx_packets_coalesce, int, 0644);
> > > +
> > > enum {
> > > VHOST_NET_VQ_RX = 0,
> > > VHOST_NET_VQ_TX = 1,
> > > @@ -127,6 +134,9 @@ static void handle_tx(struct vhost_net *net)
> > > int err, wmem;
> > > size_t hdr_size;
> > > struct socket *sock;
> > > + int bytes_coalesced = 0;
> > > + int bufs_coalesced = 0;
> > > + int packets_coalesced = 0;
> > >
> > > /* TODO: check that we are running from vhost_worker? */
> > > sock = rcu_dereference_check(vq->private_data, 1);
> > > @@ -196,14 +206,26 @@ static void handle_tx(struct vhost_net *net)
> > > if (err != len)
> > > pr_debug("Truncated TX packet: "
> > > " len %d != %zd\n", err, len);
> > > - vhost_add_used_and_signal(&net->dev, vq, head, 0);
> > > total_len += len;
> > > + packets_coalesced += 1;
> > > + bytes_coalesced += len;
> > > + bufs_coalesced += in;
> >
> > Should this instead be:
> > bufs_coalesced += out;
> >
> > Perusing the code I see that earlier there is a check to see if "in" is not
> > zero, and, if so, error out of the loop. After the check, "in" is not
> > touched until it is added to bufs_coalesced, effectively not changing
> > bufs_coalesced, meaning bufs_coalesced will never trigger the conditions
> > below.
>
> Yes. It definitely should be 'out'. 'in' should be 0 in the tx path.
>
> I tried a simpler version of this patch without any tunables by
> delaying the signaling until we come out of the for loop.
> It definitely reduced the number of vmexits significantly for small message
> guest to host stream test and the throughput went up a little.
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 9b3ca10..5f9fae9 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -197,7 +197,7 @@ static void handle_tx(struct vhost_net *net)
> if (err != len)
> pr_debug("Truncated TX packet: "
> " len %d != %zd\n", err, len);
> - vhost_add_used_and_signal(&net->dev, vq, head, 0);
> + vhost_add_used(vq, head, 0);
> total_len += len;
> if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
> vhost_poll_queue(&vq->poll);
> @@ -205,6 +205,8 @@ static void handle_tx(struct vhost_net *net)
> }
> }
>
> + if (total_len > 0)
> + vhost_signal(&net->dev, vq);
> mutex_unlock(&vq->mutex);
> }
>
>
> >
> > Or am I missing something?
> >
> > > + if (unlikely(packets_coalesced > tx_packets_coalesce ||
> > > + bytes_coalesced > tx_bytes_coalesce ||
> > > + bufs_coalesced > tx_bufs_coalesce))
> > > + vhost_add_used_and_signal(&net->dev, vq, head, 0);
> > > + else
> > > + vhost_add_used(vq, head, 0);
> > > if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
> > > vhost_poll_queue(&vq->poll);
> > > break;
> > > }
> > > }
> > >
> > > + if (likely(packets_coalesced > tx_packets_coalesce ||
> > > + bytes_coalesced > tx_bytes_coalesce ||
> > > + bufs_coalesced > tx_bufs_coalesce))
> > > + vhost_signal(&net->dev, vq);
> > > mutex_unlock(&vq->mutex);
> > > }
>
> It is possible that we can miss signaling the guest even after
> processing a few pkts, if we don't hit any of these conditions.

Yes. It really should be

	if (likely(packets_coalesced && bytes_coalesced && bufs_coalesced))
		vhost_signal(&net->dev, vq);

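To make the effect of the thresholds easier to reason about, here is a minimal user-space sketch of the signaling decision discussed in this thread. It is not vhost code: `struct coalesce_limits` and `count_signals()` are hypothetical names made up for illustration, and the sketch assumes the counters are reset after each signal, which the untested patch above does not do. It only counts how many times the guest would be notified for a batch of packets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the tx_*_coalesce module parameters. */
struct coalesce_limits {
	int bytes;
	int bufs;
	int packets;
};

/*
 * Count guest notifications for a batch of n transmitted packets.
 * len[i] is the packet length, bufs[i] the number of "out" buffers.
 * Assumption (not in the patch): counters restart after each signal.
 */
int count_signals(const int *len, const int *bufs, size_t n,
		  struct coalesce_limits lim)
{
	int bytes = 0, nbufs = 0, pkts = 0, signals = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		assert(len[i] >= 0 && bufs[i] >= 0);
		pkts += 1;
		bytes += len[i];
		nbufs += bufs[i];
		/* Same comparison as in the patch: signal once any limit is exceeded. */
		if (pkts > lim.packets || bytes > lim.bytes || nbufs > lim.bufs) {
			signals++;
			pkts = bytes = nbufs = 0;
		}
	}

	/* Final check after the loop, using the corrected condition:
	 * signal if something was processed since the last signal. */
	if (pkts && bytes && nbufs)
		signals++;

	return signals;
}
```

With all limits at 0 this degenerates to one signal per packet, as with vhost_add_used_and_signal() today. With all limits set to a huge value it gives a single signal after the loop, which is what the simpler patch without tunables does.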
> > >
> >
> > Steve D.
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > the body of a message to majordomo@...r.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html