lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 25 Feb 2008 13:42:16 -0800
From:	Stephen Hemminger <shemminger@...ux-foundation.org>
To:	Marin Mitov <mitov@...p.bas.bg>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: net: tx timeouts with skge, 8139too, dmfe drivers/NICs

On Mon, 25 Feb 2008 23:36:06 +0200
Marin Mitov <mitov@...p.bas.bg> wrote:

> On Monday 25 February 2008 10:53:01 pm you wrote:
> > Marin Mitov wrote:
> > > Hi all,
> > >
> > > I experience very rare freezes at heavy outbound traffic
> > > (sending ~4GB DVD image to another host(s) on the same LAN)
> > > using skge driver (NIC on the mobo) as well as (recently tested)
> > > using rtl8139 or dmfe NICs on the PCI bus. There is a single
> > > switch between them (tested with another one just to exclude
> > > a faulty switch).
> > >
> > > skge <--> Marvell 88E8001 chip
> > > 8139too <--> Realtek 8136B chip
> > > dmfe <--> Davicom DM9102 chip
> > >
> > > Symptoms are similar: tx timeouts and no more net activity.
> > > KDE desktop works, computational programs - work, the machine
> > > is usable, but cannot ping, nor can be ping-ed anymore.
> > > rmmod && modprobe the respective modules repairs the problem.
> > > Simple surfing/e-mailing from it do not trigger the problem.
> > >
> > > The machine is used as LTSP server for old PCs (as X terminals)
> > > (mostly outbound traffic) and is not usable as such due to this
> > > problem.
> > >
> > > The kernel is 2.6.24.2-SMP/x86_32 (PREEMPT or not - NO difference).
> > >
> > > As far as this happens with 3 different NICs/drivers could it be
> > > a problem in the (common for all of them) networking subsystem?
> >
> > A TX timeout (like hardware timeouts, in general) is a very generic
> > behavior, with many causes.
> >
> > In general, when you see timeouts with varied hardware and drivers,
> > you're almost always dealing with a problem with interrupt delivery, or
> 
> All the drivers are using #INTA on PCI bus (no MSI/MSI-X).
> 
> "problem with interrupt delivery" - you suspect interrupts incorrectly
>  disabled (lost) in the drivers or faulty hardware(motherboard)?
> 
> > a generic system problem, rather than bugs in the network stack or all
> 
> "a generic system problem" - bad config or faulty hardware(motherboard)?
> 
> Where I should look for the problem?
> 
> Just for info: the system is very stable - uptime (if no power outages) could
> be a month or more (rebooting for kernel changes or updates).
> 
> Marin Mitov

Make sure the interrupt is showing up as level triggered in /proc/interrupts.
The BIOS may be configuring it as edge-triggered and that won't work with
Ethernet drivers that use NAPI.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ