Date:	Thu, 17 Dec 2009 20:05:02 +0530
From:	Krishna Kumar2 <krkumar2@...ibm.com>
To:	Herbert Xu <herbert@...dor.apana.org.au>
Cc:	"David S. Miller" <davem@...emloft.net>,
	Jarek Poplawski <jarkao2@...il.com>, mst@...hat.com,
	netdev@...r.kernel.org, Rusty Russell <rusty@...tcorp.com.au>,
	Sridhar Samudrala <sri@...ibm.com>
Subject: Re: [RFC PATCH] Regression in linux 2.6.32 virtio_net seen with vhost-net

Herbert Xu <herbert@...dor.apana.org.au> wrote on 12/17/2009 07:14:08 PM:

> > I am confused. Isn't dequeue_skb returning NULL for 2nd - nth skbs
> > till the queue is restarted, so how is it broken?
>
> Sorry I didn't read dev_dequeue carefully enough.  Indeed it
> correctly checks the queue status so the loop that I thought
> was there doesn't exist.
>
> The requeues are probably caused by the driver still.  Was Sridhar
> testing Rusty's latest patch?

I was using numbers from his first test run; he has not posted
the requeue numbers for the NAPI run. But he did run with NAPI,
and said:

"I had to change virtnet_xmit_poll() to get it working. See below.
With this change, i don't see any 'queue full' warnings, but requeues
are still happening at the qdisc level (sch_direct_xmit() finds that
tx queue is stopped and does requeues)".
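
For reference, the requeue path Sridhar is describing looks roughly
like this (paraphrased from sch_direct_xmit() in net/sched/sch_generic.c
as of 2.6.32, not a verbatim copy):

        int ret = NETDEV_TX_BUSY;

        /* Only hand the skb to the driver if the tx queue is usable. */
        if (!netif_tx_queue_stopped(txq) && !netif_tx_queue_frozen(txq))
                ret = dev_hard_start_xmit(skb, dev, txq);

        /* Queue stopped (or driver busy): put the skb back on the qdisc.
         * These dev_requeue_skb() calls are the requeues in his numbers. */
        if (ret != NETDEV_TX_OK && ret != NETDEV_TX_LOCKED)
                ret = dev_requeue_skb(skb, q);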

I think the bug is in this check:

+     if (vi->capacity >= 2 + MAX_SKB_FRAGS) {
+           /* Suppress further xmit interrupts. */
+           vi->svq->vq_ops->disable_cb(vi->svq);
+           napi_complete(xmit_napi);
+
+           /* Don't wake it if link is down. */
+           if (likely(netif_carrier_ok(vi->vdev)))
+                 netif_wake_queue(vi->dev);
+     }

We wake the queue up too early: there is just enough space for one
more skb before the queue has to be stopped again. Hence no more
'queue full' messages, but a lot of requeues. The qdisc code is doing
the correct thing; we need to increase the wake-up limit here.

Can we try with a bigger number, say 64 or 128?
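
Roughly something like this (just a sketch against the quoted patch;
the multiplier is a placeholder, and I pass vi->dev to netif_carrier_ok()
since it takes the net_device):

        /* Wake only when there is room for several max-sized skbs, so
         * the queue does not have to stop again after a single xmit.
         * The factor of 4 is a placeholder; a fixed 64 or 128 free
         * slots is worth trying as well.
         */
        if (vi->capacity >= 4 * (2 + MAX_SKB_FRAGS)) {
                /* Suppress further xmit interrupts. */
                vi->svq->vq_ops->disable_cb(vi->svq);
                napi_complete(xmit_napi);

                /* Don't wake it if link is down. */
                if (likely(netif_carrier_ok(vi->dev)))
                        netif_wake_queue(vi->dev);
        }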

Thanks,

- KK
