linux-kernel - Re: [PATCH v2] e1000e: Increase iteration on polling MDIC ready bit

Message-ID: <20200924155355.GC3821492@lunn.ch>
Date:   Thu, 24 Sep 2020 17:53:55 +0200
From:   Andrew Lunn <andrew@...n.ch>
To:     Kai-Heng Feng <kai.heng.feng@...onical.com>
Cc:     jeffrey.t.kirsher@...el.com,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        "moderated list:INTEL ETHERNET DRIVERS" 
        <intel-wired-lan@...ts.osuosl.org>,
        "open list:NETWORKING DRIVERS" <netdev@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] e1000e: Increase iteration on polling MDIC ready bit

On Thu, Sep 24, 2020 at 11:09:58PM +0800, Kai-Heng Feng wrote:
> We are seeing the following error after S3 resume:
> [  704.746874] e1000e 0000:00:1f.6 eno1: Setting page 0x6020
> [  704.844232] e1000e 0000:00:1f.6 eno1: MDI Write did not complete
> [  704.902817] e1000e 0000:00:1f.6 eno1: Setting page 0x6020
> [  704.903075] e1000e 0000:00:1f.6 eno1: reading PHY page 769 (or 0x6020 shifted) reg 0x17
> [  704.903281] e1000e 0000:00:1f.6 eno1: Setting page 0x6020
> [  704.903486] e1000e 0000:00:1f.6 eno1: writing PHY page 769 (or 0x6020 shifted) reg 0x17
> [  704.943155] e1000e 0000:00:1f.6 eno1: MDI Error
> ...
> [  705.108161] e1000e 0000:00:1f.6 eno1: Hardware Error
> 
> As Andrew Lunn pointed out, MDIO has nothing to do with phy, and indeed
> increase polling iteration can resolve the issue.
> 
> While at it, also move the delay to the end of loop, to potentially save
> 50 us.

You are unlikely to save any time. 64 bits at 2.5MHz is 25.6uS. So it
is very unlikely that a read done directly after the write is going
to find E1000_MDIC_READY set. So this change likely causes an
additional read on MDIC. Did you profile this at all, for the normal
case?
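
To make the point above concrete, here is a minimal sketch that models
the two loop shapes. It is not the e1000e code: poll_ready(), struct
poll_result and the simulated clock are hypothetical names made up for
illustration, and the READY bit is simply assumed to become set a fixed
number of microseconds after the write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of one simulated polling run. */
struct poll_result {
	uint32_t reads;      /* number of simulated MDIC register reads */
	uint32_t elapsed_us; /* simulated time when the loop exits */
	bool ready;          /* true if READY was seen before giving up */
};

/*
 * Simulate polling for a READY bit that becomes set ready_at_us after
 * the write. delay_first selects "delay, then read" (true) or
 * "read, then delay" (false). No real hardware is touched.
 */
struct poll_result poll_ready(uint32_t ready_at_us, uint32_t delay_us,
			      uint32_t max_iter, bool delay_first)
{
	struct poll_result r = { 0, 0, false };

	assert(delay_us > 0); /* a zero delay makes the model meaningless */

	for (uint32_t i = 0; i < max_iter; i++) {
		if (delay_first)
			r.elapsed_us += delay_us;

		r.reads++; /* stands in for the register read */
		if (r.elapsed_us >= ready_at_us) {
			r.ready = true;
			break;
		}

		if (!delay_first)
			r.elapsed_us += delay_us;
	}
	return r;
}
```

Under these assumptions, with the transfer rounded up to 26uS and a
50uS delay, poll_ready(26, 50, 64, true) finishes after one read and
poll_ready(26, 50, 64, false) after two reads, both at 50uS of
simulated time. So in this model the reordering costs one extra
register read in the normal case and saves no time.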

I also don't fully understand the fix. You are now looping up to 6400
times, each with a delay of 50uS, so up to 320mS in total. That is
around 12500 times longer than it actually needs to transfer the 64
bits! I've no idea how this hardware works, but my guess would be,
something is wrong with the clock setup?
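
For reference, the arithmetic behind these numbers can be written out
as below. This is only a sketch: mdio_frame_ns() and poll_budget_ns()
are hypothetical helper names, and the inputs (64 bits, 2.5MHz, 6400
iterations, 50uS) are the figures quoted in this mail, not values read
from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Time to clock `bits` bits onto the bus at clock_hz, in nanoseconds. */
uint64_t mdio_frame_ns(uint32_t bits, uint32_t clock_hz)
{
	assert(clock_hz > 0); /* avoid dividing by zero */
	return (uint64_t)bits * 1000000000ull / clock_hz;
}

/* Worst-case time spent in a poll loop of `iterations` fixed delays. */
uint64_t poll_budget_ns(uint32_t iterations, uint32_t delay_us)
{
	return (uint64_t)iterations * delay_us * 1000ull;
}
```

With those inputs, mdio_frame_ns(64, 2500000) is 25600 (25.6uS) and
poll_budget_ns(6400, 50) is 320000000 (320mS), a ratio of 12500.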

     Andrew