Message-ID: <20090920183458.GC28315@1wt.eu>
Date:	Sun, 20 Sep 2009 20:34:58 +0200
From:	Willy Tarreau <w@....eu>
To:	Grozdan <neutrino8@...il.com>
Cc:	Stephen Hemminger <shemminger@...tta.com>,
	linux-kernel@...r.kernel.org
Subject: Re: sky2 rx length errors

Hi guys,

On Sun, Sep 20, 2009 at 08:16:02PM +0200, Grozdan wrote:
> 2009/9/20 Stephen Hemminger <shemminger@...tta.com>:
> 
> >
> > This error status occurs if the length reported by the PHY does not
> > match the len reported by the DMA engine.  The error status is:
> >   0x4420100 = length 1090 + broadcast packet...
> >
> > No idea what is on your network, but perhaps there is some MTU confusion?
> > Since martian destination seems related, knowing more about that packet
> > might help.
> >
> 
> Hi,
> 
> Thanks for the reply. There's nothing on my home network here. It is
> just a direct connection from my PC to my cable modem and there's
> nothing in between. I've googled a bit and it seems others also
> encounter this problem.
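
The status word quoted above can be unpacked by hand. The following is a minimal sketch, assuming only what the quoted example implies: the upper 16 bits carry the frame length (0x0442 = 1090) and the lower 16 bits carry flag bits. The function names are hypothetical and are not taken from the sky2 driver, and the meaning of individual flag bits is deliberately not decoded here.

```python
# Sketch only: the field layout is inferred from the quoted example
# (0x4420100 -> length 1090), not copied from the sky2 driver source.

STATUS = 0x4420100  # error status quoted in the thread


def split_rx_status(status: int) -> tuple[int, int]:
    """Split a 32-bit status word into (length, flags).

    Assumption: length lives in the upper 16 bits, flags in the lower 16.
    """
    length = (status >> 16) & 0xFFFF
    flags = status & 0xFFFF
    return length, flags


def lengths_disagree(status: int, dma_len: int) -> bool:
    """Hypothetical check: True when the length embedded in the status
    word differs from the length reported separately (e.g. by DMA)."""
    length, _ = split_rx_status(status)
    return length != dma_len


length, flags = split_rx_status(STATUS)
print(f"length={length} flags={flags:#06x}")  # length=1090 flags=0x0100
```

A frame whose two lengths differ, for example `lengths_disagree(STATUS, 1094)`, is the kind of mismatch the quoted explanation describes.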

I've encountered similar issues on early 8053 chips too. Those were
soldered onto the motherboards of network servers bought about 4 years
ago. No matter what I tried (changing drivers, enabling/disabling flow
control, changing negotiation speed, etc.), the PHY would occasionally
and randomly go mad and start shifting received frames by a few bytes,
thus causing loss of network connectivity. The logs would also display
martians, depending on which bytes of the frame ended up in the IP
header once shifted.

Sometimes it would recover by itself after a chip reset, sometimes
not. Disabling flow control seemed to help a bit, but not by much.
It would randomly hang every 1-30 days, which made the issue rather
hard to debug.

I don't precisely remember the revision of the chip, but I remember
that it was pretty old and that more recent machines had a much higher
revision number and never exhibited the issue. Also, my desktop right
here runs off a 88E8056 (~= two 8053s) and has never failed yet.

So I really think that there was a horrible batch of these chips in
the 8053's early days.

> I've read a few posts on the Ubuntu bugzilla
> where people change the MTU from 1500 to 1492 and this fixes the
> problem. However, even with this, some report that the problem is
> still there. I did the same and it didn't change anything for me.

Did not help for me either.

> So I
> disabled my onboard NIC and added a 3Com one which has been working
> perfectly so far and I think I'll just keep using it instead of the
> Marvell one.

That's the best you can do if you happen to have one of those buggy
chips. We had to put Intel NICs in the servers that were causing
trouble at the customer's site, and that solved the issue too.

Regards,
Willy
