Message-ID: <20061003202643.0e0ceab2@localhost.localdomain>
Date:	Tue, 3 Oct 2006 20:26:43 -0700
From:	Stephen Hemminger <shemminger@...l.org>
To:	Matthias Hentges <oe@...tges.net>
Cc:	Jeff Garzik <jeff@...zik.org>, Andrew Morton <akpm@...l.org>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Netdev List <netdev@...r.kernel.org>
Subject: Re: sky2 (was Re: 2.6.18-mm2)

On Wed, 04 Oct 2006 04:57:08 +0200
Matthias Hentges <oe@...tges.net> wrote:

> Hello Stephen,
> 
> On Thursday, 28.09.2006, at 16:19 -0700, Stephen Hemminger wrote:
> 
> > Here is the debug patch I sent to the first reporter of the problem.
> > I know what the offset is supposed to be, so if the PCI subsystem is
> > wrong, this will show. 
> > 
> > --- sky2.orig/drivers/net/sky2.c	2006-09-28 08:45:27.000000000 -0700
> > +++ sky2/drivers/net/sky2.c	2006-09-28 08:51:24.000000000 -0700
> > @@ -2463,6 +2463,7 @@
> >  
> >  	sky2_write8(hw, B0_CTST, CS_MRST_CLR);
> >  
> > +#define PEX_UNC_ERR_STAT 0x104		/* PCI extended error capability */
> >  	/* clear any PEX errors */
> >  	if (pci_find_capability(hw->pdev, PCI_CAP_ID_EXP)) {
> >  		hw->err_cap = pci_find_ext_capability(hw->pdev, PCI_EXT_CAP_ID_ERR);
> > @@ -2470,6 +2471,15 @@
> >  			sky2_pci_write32(hw,
> >  					 hw->err_cap + PCI_ERR_UNCOR_STATUS,
> >  					 0xffffffffUL);
> > +		else
> > +			printk(KERN_ERR PFX "pci express found but not extended error support?\n");
> > +		
> > +		if (hw->err_cap + PCI_ERR_UNCOR_STATUS != PEX_UNC_ERR_STAT) {
> > +			
> > +			printk(KERN_ERR PFX "pci express error status register fixed from %#x to %#x\n",
> > +			       hw->err_cap, PEX_UNC_ERR_STAT - PCI_ERR_UNCOR_STATUS);
> > +			hw->err_cap = PEX_UNC_ERR_STAT - PCI_ERR_UNCOR_STATUS;
> > +		}
> >  	}
> >  
> >  	hw->pmd_type = sky2_read8(hw, B2_PMD_TYP);
> 
> while the above patch indeed removes the error messages from my previous
> mail, I have since seen random but reproducible freezes of the box in
> question. I believe they are sky2 related since the freeze can be
> triggered by continuous network traffic (like playing a movie over NFS
> etc.).

When it applies the fix, what does the log say? I'm probably going to back
out the change that uses the pci_XXX functions for the PCI Express extended
error handling.
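
For reference, the offset check in the debug patch quoted above can be
sketched as standalone C. This is only an illustration, not driver code:
fixup_err_cap() is a hypothetical helper, the value 4 for
PCI_ERR_UNCOR_STATUS is assumed from the kernel's pci_regs.h, and 0x104 is
the expected register address from the patch.

```c
#include <stdio.h>

/* Assumed to match the kernel's pci_regs.h definition. */
#define PCI_ERR_UNCOR_STATUS 4
/* Expected address of the uncorrectable error status register (from the patch). */
#define PEX_UNC_ERR_STAT 0x104

/*
 * Hypothetical helper mirroring the patch: if the capability offset found
 * by the PCI subsystem does not put the status register at 0x104, report
 * the mismatch and return the offset the driver expects instead.
 */
static int fixup_err_cap(int err_cap)
{
	if (err_cap + PCI_ERR_UNCOR_STATUS != PEX_UNC_ERR_STAT) {
		int fixed = PEX_UNC_ERR_STAT - PCI_ERR_UNCOR_STATUS;

		fprintf(stderr,
			"pci express error status register fixed from %#x to %#x\n",
			err_cap, fixed);
		return fixed;
	}
	return err_cap;
}
```

A capability offset of 0x100 passes through unchanged; any other value,
including 0 when the extended capability is not found, is replaced by 0x100
and logged.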

> The freezes only happen with 2.6.18-mm2 and 2.6.18-mm3. 2.6.18-mm1 works
> perfectly fine.
> I've hooked up the box to my laptop via a serial cable and captured all
> kernel messages from booting up the machine to the freeze. You'll note
> that the last messages are from the sky2 driver ;)
> 

Does it still happen with Linus's git tree? If so, a git bisect might
help. It might not be sky2 related at all; there have been lots of changes.

> Once frozen the network is dead, the screen won't wake up from suspend
> and CAPSLOCK can not be toggled. SYSRQ (sp?) still works tho.
> 
> Any help in debugging this problem would be appreciated =)

The TX timeout is a symptom of a common, still unfixed bug where
the transmitter stops. I'm working on reproducing it on my hardware and switches,
because without a reproducible test it's just shooting in the dark, and
that isn't working.
