Message-Id: <20080327.173418.18777696.davem@davemloft.net>
Date:	Thu, 27 Mar 2008 17:34:18 -0700 (PDT)
From:	David Miller <davem@...emloft.net>
To:	Matheos.Worku@....COM
Cc:	jesse.brandeburg@...el.com, jarkao2@...il.com,
	netdev@...r.kernel.org, herbert@...dor.apana.org.au,
	hadi@...erus.ca
Subject: Re: 2.6.24 BUG: soft lockup - CPU#X

From: Matheos Worku <Matheos.Worku@....COM>
Date: Thu, 27 Mar 2008 17:19:42 -0700

> Actually I am running a version of the nxge driver which uses only one 
> TX ring, no LLTX enabled so the driver does single threaded TX.

Ok.

> On the other hand, uperf (or iperf, netperf ) is running multiple TX
> connections in parallel and the connections are bound on multiple
> processors, hence they are running in parallel.

Yes, this is what I was interested in.

I think I know what's wrong.

If one cpu gets into the "qdisc remove, give to device" loop, other
cpus will simply add to the qdisc and that first cpu will do all of
the TX processing.

This helps performance, but in your case it is clear that if the
device is fast enough and there are enough other cpus generating TX
traffic, it is quite easy for one cpu to get wedged in that loop and
never exit.

The code in question is net/sched/sch_generic.c:__qdisc_run(): it just
loops there until the device TX queue fills up or there are no more
packets left in the qdisc queue.
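
For reference, the loop looks roughly like this in 2.6.24 (paraphrased
from memory, check the actual tree for the exact code):

	void __qdisc_run(struct net_device *dev)
	{
		/* Keep pulling packets off the qdisc and handing them
		 * to the driver until either the device TX queue fills
		 * up or the qdisc runs dry.
		 */
		do {
			if (!qdisc_restart(dev))
				break;
		} while (!netif_queue_stopped(dev));

		clear_bit(__LINK_STATE_QDISC_RUNNING, &dev->state);
	}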

qdisc_run() (in include/net/pkt_sched.h) sets
__LINK_STATE_QDISC_RUNNING to tell other cpus that there is already a
cpu processing the queue inside of __qdisc_run().
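
It is roughly (again paraphrased, see the header for the exact code):

	static inline void qdisc_run(struct net_device *dev)
	{
		/* Only the cpu that wins the test_and_set_bit() does
		 * the actual dequeue work; every other cpu just
		 * enqueues in dev_queue_xmit() and returns.
		 */
		if (!netif_queue_stopped(dev) &&
		    !test_and_set_bit(__LINK_STATE_QDISC_RUNNING,
				      &dev->state))
			__qdisc_run(dev);
	}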

net/core/dev.c:dev_queue_xmit() then goes:

	if (q->enqueue) {
		/* Grab device queue */
		spin_lock(&dev->queue_lock);
		q = dev->qdisc;
		if (q->enqueue) {
			/* reset queue_mapping to zero */
			skb_set_queue_mapping(skb, 0);
			rc = q->enqueue(skb, q);
			qdisc_run(dev);
			spin_unlock(&dev->queue_lock);

			rc = rc == NET_XMIT_BYPASS ? NET_XMIT_SUCCESS : rc;
			goto out;
		}
		spin_unlock(&dev->queue_lock);
	}

The first cpu will get into __qdisc_run(), but the other
ones will just q->enqueue() and exit since the first cpu
has indicated it is processing the qdisc.

I'm not sure how we should fix this at the moment: we want to keep
the behavior, but on the other hand we need to break out of the loop
so we don't get stuck there for too long.
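
One idea, purely as an untested sketch (the one-jiffy budget is
arbitrary), would be to bound the loop and hand the rest of the work
to the TX softirq via netif_schedule(), so the submitting cpu gets
out of there:

	void __qdisc_run(struct net_device *dev)
	{
		unsigned long start_time = jiffies;

		while (qdisc_restart(dev)) {
			if (netif_queue_stopped(dev))
				break;

			/* Postpone the rest if something else needs
			 * the cpu or we have been here for a full
			 * jiffy; netif_schedule() raises
			 * NET_TX_SOFTIRQ so the queue gets picked up
			 * again shortly.
			 */
			if (need_resched() || jiffies != start_time) {
				netif_schedule(dev);
				break;
			}
		}

		clear_bit(__LINK_STATE_QDISC_RUNNING, &dev->state);
	}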
