Date:	Tue, 25 Mar 2008 17:15:18 +0200
From:	Marin Mitov <mitov@...p.bas.bg>
To:	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	Stephen Hemminger <shemminger@...ux-foundation.org>
Subject: [BUG] unsolicited IRQ disabling at the IO-APIC leading to eth0 freezes

WAS: net: tx timeouts with skge, 8139too, dmfe drivers/NICs

Hi all,

I am observing rare freezes (blocking) of eth0 with:

NETDEV WATCHDOG: eth0: transmit timed out

in dmesg output.

The problem has already been described in a previous message:

http://lkml.org/lkml/2008/2/25/312

with some additional observations, as described in:

http://lkml.org/lkml/2008/3/12/96

Recently I found that the IRQ# used by the driver/NIC has somehow
been disabled/masked at the IO-APIC. This blocks interrupt delivery to
the driver's irq_handler, hence the NETDEV WATCHDOG messages.

Putting:

disable_irq(_nosync)(irq#);
enable_irq(irq#);

in the dev->tx_timeout() method restores the working state of eth0
(at least for skge), so the interface now keeps working, only (sometimes) posting

NETDEV WATCHDOG: eth0: transmit timed out

messages in the log.

If I EXPORT_SYMBOL(irq_desc) (from kernel/irq/handle.c)
I am able to restore the working state of the eth0 interface with

(irq_desc + irq#)->chip->enable(irq#)

or

(irq_desc + irq#)->chip->unmask(irq#)

(properly locked), instead of disable/enable_irq(irq#).
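
To make the two workarounds above easier to discuss, here is a small
userspace model in C. It is only a sketch and not kernel code. It
assumes that disable_irq()/enable_irq() keep a nesting count per line,
and that the enable call which drops the count to zero unmasks the line
regardless of how it got masked. That would be consistent with the
behaviour reported here, but it has not been checked against the source
of the kernel in question. All model_* names are hypothetical, and
locking is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_IRQS 32  /* hypothetical table size for this model */

/* Userspace stand-in for one interrupt line; NOT the kernel's irq_desc. */
struct model_irq_desc {
    unsigned int depth;   /* nesting count of disable calls */
    bool masked;          /* modelled mask bit at the IO-APIC */
};

struct model_irq_desc model_desc[MODEL_NR_IRQS];

/* Counted disable: only the first nested call sets the mask bit. */
void model_disable_irq(unsigned int irq)
{
    assert(irq < MODEL_NR_IRQS);
    if (model_desc[irq].depth++ == 0)
        model_desc[irq].masked = true;
}

/* Counted enable: only the call that brings depth back to 0 unmasks. */
void model_enable_irq(unsigned int irq)
{
    assert(irq < MODEL_NR_IRQS);
    assert(model_desc[irq].depth > 0);  /* catch an unbalanced enable */
    if (--model_desc[irq].depth == 0)
        model_desc[irq].masked = false;
}

/* The suspected fault: the line is masked, but depth is not raised. */
void model_stray_mask(unsigned int irq)
{
    assert(irq < MODEL_NR_IRQS);
    model_desc[irq].masked = true;
}

/* Second workaround: clear the mask bit directly, bypassing the count. */
void model_chip_unmask(unsigned int irq)
{
    assert(irq < MODEL_NR_IRQS);
    model_desc[irq].masked = false;
}

bool model_irq_masked(unsigned int irq)
{
    assert(irq < MODEL_NR_IRQS);
    return model_desc[irq].masked;
}
```

In this model, calling model_stray_mask() and then the
model_disable_irq()/model_enable_irq() pair leaves the line unmasked
with depth back at 0, and model_chip_unmask() has the same effect
without touching the count.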

Just for info: the driver's irq_handler always returns IRQ_HANDLED and never
(verified) IRQ_NONE, so the irq# is not being disabled due to unhandled irqs.
Yes, the driver declares IRQF_SHARED, but it is the only one on this irq#
(as viewed in /proc/interrupts).
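
For anyone who wants to repeat the /proc/interrupts check in a script
rather than by eye, a userspace C sketch follows. It assumes that each
line starts with the IRQ number followed by a colon, and that handlers
sharing a line are listed comma-separated at the end of it. The
function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Return the number of handler names on one /proc/interrupts line if
 * the line describes `irq`, or -1 if it describes something else.
 * Assumption: names sharing a line are separated by commas.
 */
int count_handlers_on_line(const char *line, unsigned int irq)
{
    char *end;
    unsigned long n;
    int count = 1;

    assert(line != NULL);
    n = strtoul(line, &end, 10);        /* skips leading blanks */
    if (end == line || *end != ':' || n != irq)
        return -1;                      /* header, NMI:, or another irq */

    for (const char *p = end; *p != '\0'; p++)
        if (*p == ',')
            count++;
    return count;
}

/* Scan a file in /proc/interrupts format; -1 if the irq is not listed. */
int count_handlers(const char *path, unsigned int irq)
{
    char line[4096];                    /* assumed big enough for a line */
    int result = -1;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        int c = count_handlers_on_line(line, irq);
        if (c >= 0) {
            result = c;
            break;
        }
    }
    fclose(f);
    return result;
}
```

Calling count_handlers("/proc/interrupts", irq) with the NIC's irq# and
getting 1 back would correspond to the situation described above: the
driver is alone on the line.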

The system is running kernel-2.6.14.3-SMP (AMD64 X2) on Asus A7V Deluxe.

Am I observing an IRQ locking/race condition OUTSIDE of the net driver?

All suggestions for further investigation are welcome.

Marin Mitov

