Date:	Thu, 17 Dec 2009 15:33:57 +0530
From:	Krishna Kumar2 <krkumar2@...ibm.com>
To:	Sridhar Samudrala <sri@...ibm.com>
Cc:	Herbert Xu <herbert@...dor.apana.org.au>, mst@...hat.com,
	netdev@...r.kernel.org, Rusty Russell <rusty@...tcorp.com.au>
Subject: Re: [RFC PATCH] Regression in linux 2.6.32 virtio_net seen with vhost-net

Sridhar Samudrala <sri@...ibm.com> wrote:
>
> Herbert Xu wrote:
> > On Wed, Dec 16, 2009 at 09:05:32PM -0800, Sridhar Samudrala wrote:
> >
> >> I think sch_direct_xmit() is not even calling dev_hard_start_xmit() as
> >> the tx queue is stopped
> >> and does a dev_requeue_skb() and returns NETDEV_TX_BUSY.
> >>
> >
> > Yes but if the queue was stopped then we shouldn't even get into
> > sch_direct_xmit.
> I don't see any checks for txq_stopped in the callers of
> sch_direct_xmit(): __dev_xmit_skb() and qdisc_restart(). Both these
> routines get the txq and call sch_direct_xmit(), which checks if the
> tx queue is stopped or frozen.
>
> Am I missing something?

Yes - dequeue_skb.

The final skb, before the queue was stopped, is transmitted by
the driver. The next time sch_direct_xmit is called, it gets an
skb, finds the device is stopped, and requeues the skb. For
all subsequent xmits, dequeue_skb returns NULL (and the other
caller, __dev_xmit_skb, never gets to call it since qdisc_qlen
is non-zero), so no further requeues happen. This also means that
the number of requeues you see (e.g. 283K in one run) is the number
of times the queue was stopped and restarted. So it looks like
the driver either:

1. didn't stop the queue after xmitting a packet successfully (the
      condition being that it would not be possible to xmit the
      next skb). But this doesn't seem to be the case.
2. wrongly restarted the queue. Possible, since a few places
      use both the start & wake queue APIs.
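
To illustrate the counting argument above, here is a rough toy model in
Python. It is not kernel code: the class ToyTxPath, the ring_slots
parameter and the simulate() helper are all made up for illustration,
and the driver is assumed to stop the queue once a fixed number of slots
is filled. Only the dequeue logic is loosely modeled on dequeue_skb
(a held skb is not handed out while the queue is stopped).

```python
from collections import deque


class ToyTxPath:
    """Toy model of one qdisc feeding one tx queue (hypothetical, not kernel code)."""

    def __init__(self, packets, ring_slots):
        self.backlog = deque(range(packets))  # packets waiting in the qdisc
        self.held = None          # the single requeued packet, if any
        self.stopped = False      # models the tx queue "stopped" state
        self.ring_slots = ring_slots
        self.in_ring = 0
        self.sent = 0
        self.requeues = 0
        self.stops = 0

    def dequeue(self):
        # A held (requeued) packet is only released once the queue runs again.
        if self.held is not None:
            if self.stopped:
                return None
            pkt, self.held = self.held, None
            return pkt
        return self.backlog.popleft() if self.backlog else None

    def direct_xmit(self, pkt):
        # Queue stopped: hold the packet and count one requeue.
        if self.stopped:
            self.held = pkt
            self.requeues += 1
            return False
        self.sent += 1
        self.in_ring += 1
        # Assumed driver policy: stop once the next packet would not fit.
        if self.in_ring == self.ring_slots:
            self.stopped = True
            self.stops += 1
        return True

    def run_once(self):
        pkt = self.dequeue()
        return pkt is not None and self.direct_xmit(pkt)

    def restart(self):
        # Models the driver freeing its slots and restarting the queue.
        self.in_ring = 0
        self.stopped = False


def simulate(packets, ring_slots, attempts_while_stopped):
    path = ToyTxPath(packets, ring_slots)
    while path.backlog or path.held is not None:
        while path.run_once():
            pass
        # Further attempts while stopped get nothing from dequeue().
        for _ in range(attempts_while_stopped):
            path.run_once()
        path.restart()
    return path
```

In this model the requeue count never exceeds the number of stops, and
it does not change with the number of xmit attempts made while the queue
is stopped, which is the behaviour described above.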

Thanks,

- KK

