lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 04 Jun 2010 23:24:59 +0300
From:	Timo Teräs <timo.teras@....fi>
To:	Phil Sutter <phil@....cc>
CC:	françois romieu <romieu@...zoreil.com>,
	netdev@...r.kernel.org
Subject: Re: still having r8169 woes with XID 18000000

On 06/04/2010 08:31 PM, Timo Teräs wrote:
> On 06/04/2010 04:43 PM, Phil Sutter wrote:
>> On Fri, Jun 04, 2010 at 04:02:11PM +0300, Timo Teräs wrote:
>>>> Comparing r8169-6.013 with it's predecessor 6.012, you'll find a newly
>>>> enabled function rtl8169_phy_power_up() as well as some more invocations
>>>> of rtl8169_phy_power_down().
>>>>
>>>> This is probably the solution to these (at least in our case) very
>>>> sporadic, but highly annoying, problems. In fact, when our NIC didn't
>>>> detect any link, it needed a full power-cycle (no success with
>>>> reset-button), so almost not workaroundable.
>>>
> However, removing the specific phy config code
> (rtl8169scd_hw_phy_config) which was introduced by commit 2e955856ff
> seems to solve it. At least I was not able to reproduce the failure with
> 20-30 module reloads.

Ok, I figured that either the data the phy config writes is bad, or mdio
io is failing, so I added some additional checks to mdio_write and
mdio_read. More loops (upto 2000 iterations) and debug print if it
ultimately failed. And it did!

At bootup I got this:

r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
r8169 0000:00:09.0: PCI->APIC IRQ transform: INT A -> IRQ 18
r8169 0000:00:09.0: no PCI Express capability
eth0: RTL8169sc/8110sc at 0xf835c000, 00:30:18:a6:2b:6c, XID 18000000 IRQ 18
r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
r8169 0000:00:0b.0: PCI->APIC IRQ transform: INT A -> IRQ 19
r8169 0000:00:0b.0: no PCI Express capability
eth1: RTL8169sc/8110sc at 0xf8360000, 00:30:18:a6:2b:6d, XID 18000000 IRQ 19
r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
r8169 0000:00:0c.0: PCI->APIC IRQ transform: INT A -> IRQ 16
r8169 0000:00:0c.0: no PCI Express capability
eth2: RTL8169sc/8110sc at 0xf8364000, 00:30:18:a6:2b:6e, XID 18000000 IRQ 16
r8169: mdio_write(f8364000, 0x00000003, 0000000a1) required 2000 cycles
r8169: mdio_write(f8364000, 0x00000000, 000001000) required 2000 cycles
r8169: mdio_write(f8364000, 0x00000000, 00000a0ff) required 2000 cycles
r8169: mdio_write(f8364000, 0x00000014, 00000fb54) required 2000 cycles

And eth2 was not working. Reloading the module gave a lot of other
mdio_write and mdio_read errors.

It seems to be pretty random when the errors occur, but that's the
reason why the NIC stops working: mdio_write() fails (one or more times)
at some crucial point of the board specific phy config code resulting in
bad state.

Any ideas how to debug this further?

I guess next step is to compile the Realtek driver and see if that works
right.

> One more curiosity: if i do a hard power reset, the NIC has green link
> indicator led after power up. When loading the kernel module it goes to
> orange/red. I wonder why the difference.

Figured this. At startup it goes to 100mbit/s fixed mode. After module
load it gets 1gb/s. Setting it manually to 100mbit/s changes the color
back to green. So it's just a speed indicator.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ