Message-ID: <20100817123702.3d09a35b@nehalam>
Date: Tue, 17 Aug 2010 12:37:02 -0700
From: Stephen Hemminger <shemminger@...tta.com>
To: Maciej Żenczykowski <zenczykowski@...il.com>
Cc: Stephen Hemminger <shemminger@...ux-foundation.org>,
Linux NetDev <netdev@...r.kernel.org>
Subject: Re: sky2 driver fails to handle "rx length error: status 0x5d60100
length 2982" gracefully
On Thu, 12 Aug 2010 13:31:13 -0700
Maciej Żenczykowski <zenczykowski@...il.com> wrote:
> > The status values indicate that the GMAC (frame parser) got a reasonable
> > size frame but the DMA merged frames together. This indicates a timing
> > problem. There are some bits that even the NDA programmer's manual doesn't
> > explain. The Linux driver expects the BIOS or EEPROM to set them correctly,
> > because different problems need different settings.
> >
> > There is firmware in the EEPROM that configures internal state. On one motherboard
> > the vendor provided an update. There is no good way to update this from Linux;
> > you need to go to the system vendor and install the firmware with their native OS
> > (i.e. Windows or MacOS).
>
> Perfectly reasonable response. If there was a firmware update fix,
> I'd apply it...
> That would presumably prevent this from ever happening in the first place.
>
> But why doesn't the network driver reset the nic when it detects this
> 'rx length' error?
>
> I'm not asking for the error to not happen (besides it happens very rarely)...
>
> I'm asking, why does this error permanently hose the network driver?
> Once this happens the network card is not usable - traffic does not
> flow through it.
> You need to "ip link set down && ... up" to fix it. Isn't this
> something the driver could and should do all by itself?
Also, the driver could schedule a reset (that is what the watchdog does),
but it looks like the receive DMA is walking past the end of the packet,
and that is really dangerous since it could clobber random memory.
You might want to increase the size of the rx DMA buffer and dump the
contents of the receive buffer to see whether there is a risk of memory
corruption. If the End Of Frame handling in the DMA hardware is not
working, there is a real danger if the driver silently continues.
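
To make the mismatch in the reported error concrete, here is a minimal
standalone sketch (not the driver code) that decodes the status word from
the subject line. It assumes the GMAC frame length sits in bits 30..16 of
the status word and that bit 8 means "receive OK"; the macro and function
names are made up for illustration.

```c
#include <stdint.h>

/* Assumed layout of the GMAC receive status word (hypothetical names). */
#define RX_STATUS_LEN_SHIFT 16
#define RX_STATUS_LEN_MASK  0x7fffu    /* bits 30..16: length seen by the GMAC */
#define RX_STATUS_RX_OK     (1u << 8)  /* frame parsed without error */

enum rx_verdict { RX_ACCEPT, RX_DROP, RX_LEN_MISMATCH };

/* Frame length as reported by the GMAC frame parser. */
static unsigned int gmac_frame_len(uint32_t status)
{
    return (status >> RX_STATUS_LEN_SHIFT) & RX_STATUS_LEN_MASK;
}

/* Compare the parser's length with the byte count the DMA engine reported. */
static enum rx_verdict classify_rx(uint32_t status, unsigned int dma_len)
{
    if (!(status & RX_STATUS_RX_OK))
        return RX_DROP;            /* parser flagged the frame as bad */
    if (gmac_frame_len(status) != dma_len)
        return RX_LEN_MISMATCH;    /* DMA wrote a different number of bytes */
    return RX_ACCEPT;
}
```

Under those assumptions, status 0x5d60100 decodes to a 1494 byte frame
with the OK bit set, while the DMA reported 2982 bytes, so
classify_rx(0x5d60100, 2982) returns RX_LEN_MISMATCH. What to do on a
mismatch (drop, reset, or stop) is the open question in this thread; the
sketch only detects it.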