netdev - Re: [PATCH net-next 0/5] net: phy: improve and simplify phylib state machine

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <67c3bafb-5b4b-c33e-fd84-4cc6919b4bcb@gmail.com>
Date:   Wed, 7 Nov 2018 21:45:21 +0100
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     Florian Fainelli <f.fainelli@...il.com>,
        David Miller <davem@...emloft.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next 0/5] net: phy: improve and simplify phylib state
 machine

On 07.11.2018 21:21, Andrew Lunn wrote:
> On Wed, Nov 07, 2018 at 09:05:49PM +0100, Heiner Kallweit wrote:
>> On 07.11.2018 20:48, Andrew Lunn wrote:
>>> On Wed, Nov 07, 2018 at 08:41:52PM +0100, Heiner Kallweit wrote:
>>>> This patch series is based on two axioms:
>>>>
>>>> - During autoneg a PHY always reports the link being down
>>>
>>> Hi Heiner
>>>
>>> I think that is a risky assumption to make.
>>>
>> I wasn't sure initially too (found no clear rule in 802.3 clause 22)
>> and therefore asked around. Florian agrees to the assumption,
>> see here: https://www.spinics.net/lists/netdev/msg519242.html
>>
>> If a PHY reports the link as up then every user would assume that
>> data can be transferred. But that's not the case during aneg.
>> Therefore reporting the link as up during aneg wouldn't make sense.
> 
> Hi Heiner
> 
> If auto-neg has already been completed once before, i can see a lazy
> hardware designed not reporting link down, or at least, not until
> auto-neg actually fails.
> 
"aneg finished" flag means that the aneg parameters in the register set
are valid. Once the link goes down that's not necessarily the case any
longer. E.g. some PHYs have an "auto speed down" feature and reduce
the speed to save power once they detect the link is down.
Of course I can not rule out that there are broken designs (or as you
stated more politely: lazy designs) out there. But in this case I assume
we would see issues already. And we would have to think about whether we
want to support such broken / lazy designs in phylib.

> And what about if link is down for too short a time for us to notice?
> I've seen some code fail because the kernel went off and did something
> else for too long, and a state change was missed. 
> 
This is a case we have already, independent of my change.
genphy_update_link() reads BMSR twice, thus ignoring potential latched
info about a temporary link failure. When polling phylib ignores
everything that happens between two poll intervals.

>>> What happens if this assumption is incorrect?
>>>
>> Then we have to flush this patch series down the drain ;)
>> At least I would have to check in detail which parts need to be
>> changed. I clearly mention the assumptions so that every
>> reviewer can check whether he agrees.
> 
> Thanks for doing that. I want to be happy this is safe, and not going
> to introduce regressions.
> 
>    Andrew
>