Date:	Sat, 1 Nov 2008 06:48:59 +0100
From:	Lennert Buytenhek <buytenh@...tstofly.org>
To:	jeff@...zik.org
Cc:	netdev@...r.kernel.org
Subject: Re: [PATCH] mv643xx_eth: fix SMI bus access timeouts

On Sat, Nov 01, 2008 at 06:32:20AM +0100, Lennert Buytenhek wrote:

> The mv643xx_eth mii bus implementation uses wait_event_timeout() to
> wait for SMI completion interrupts.
> 
> If wait_event_timeout() would return zero, mv643xx_eth would conclude
> that the SMI access timed out, but this is not necessarily true --
> wait_event_timeout() can also return zero in the case where the SMI
> completion interrupt did happen in time but where it took longer than
> the requested timeout for the process performing the SMI access to be
> scheduled again.  This would lead to occasional SMI access timeouts
> when the system would be under heavy load.
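
The re-check described in the quoted text can be sketched in user space.
This is only an illustration, not the driver's code: smi_wait_status()
and its parameters are hypothetical names, wait_ret stands for the value
wait_event_timeout() returned, and smi_done stands for a fresh read of
the SMI completion state taken after the wait.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether an SMI access really timed out.
 * A zero return from the wait is treated as a timeout only if the
 * hardware still reports the access as unfinished.
 */
static int smi_wait_status(long wait_ret, bool smi_done)
{
	/* wait_event_timeout() never returns a negative value. */
	assert(wait_ret >= 0);

	if (wait_ret == 0 && !smi_done)
		return -ETIMEDOUT;	/* timer expired and access not done */

	return 0;			/* completed, possibly noticed late */
}
```

With this check, a completion that was merely noticed late (wait_ret of
zero, smi_done true) is reported as success rather than as a timeout.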

FWIW, where I've been seeing this is mostly during heavy softirq
load, e.g. when doing routing when the box can't keep up.

When the system is being hammered like this, simple things like
querying the switch chip for its statistics counters (by doing
"ethtool -S <switch interface>") can take seconds.  Querying the
hardware switch stats consists of doing a lot of MII accesses, and
each of those takes tens of milliseconds to return: the issuer goes
to sleep after issuing the MII access, waiting for an MII done
interrupt, but won't get scheduled again to issue its next MII access
until the rx softirq has decided that it has done enough looping.

This patch makes it a lot more bearable -- but it's still only a bit
of a stopgap:


diff --git a/kernel/softirq.c b/kernel/softirq.c
index c506f26..f7fd630 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -215,7 +215,7 @@ restart:
 	local_irq_disable();
 
 	pending = local_softirq_pending();
-	if (pending && --max_restart)
+	if (pending && !need_resched() && --max_restart)
 		goto restart;
 
 	if (pending)
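
The effect of the extra need_resched() test on the restart loop can be
sketched with a toy user-space model.  This is not kernel code:
softirq_passes() and its parameters are hypothetical, the model assumes
softirq work is always pending, need_resched() is represented by a pass
count after which a reschedule is wanted, and MAX_SOFTIRQ_RESTART is
assumed to be 10.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SOFTIRQ_RESTART 10	/* assumed restart limit */

/*
 * Hypothetical model of the restart loop.  Returns how many passes over
 * the softirq handlers run before the loop gives up.  resched_after is
 * the pass at which a reschedule becomes wanted; check_resched selects
 * the patched condition.
 */
static int softirq_passes(int resched_after, bool check_resched)
{
	int max_restart = MAX_SOFTIRQ_RESTART;
	int passes = 0;
	bool pending, resched;

	assert(resched_after >= 1);

restart:
	passes++;			/* one pass over pending handlers */
	pending = true;			/* model: new work always arrived */
	resched = check_resched && passes >= resched_after;

	if (pending && !resched && --max_restart)
		goto restart;

	return passes;
}
```

In this model the unpatched loop always runs all 10 passes under
sustained load, while the patched loop stops at the first pass after
which a reschedule is wanted, e.g. softirq_passes(3, true) stops after
3 passes.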