lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 20 Mar 2020 09:07:21 +0100
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Russell King - ARM Linux admin <linux@...linux.org.uk>
Cc:     Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        David Miller <davem@...emloft.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next 1/3] net: phy: add and use phy_check_downshift

On 19.03.2020 18:08, Russell King - ARM Linux admin wrote:
> On Thu, Mar 19, 2020 at 05:36:30PM +0100, Heiner Kallweit wrote:
>> On 19.03.2020 14:58, Russell King - ARM Linux admin wrote:
>>> On Thu, Mar 19, 2020 at 02:04:29PM +0100, Andrew Lunn wrote:
>>>>> The only time that this helps is if PHY drivers implement reading a
>>>>> vendor register to report the actual link speed, and the PHY specific
>>>>> driver is used.
>>>>
>>>> So maybe we either need to implement this reading of the vendor
>>>> register as a driver op, or we have a flag indicating the driver is
>>>> returning the real speed, not the negotiated speed?
>>>
>>> I'm not sure it's necessary to have another driver op.  How about
>>> this for an idea:
>>>
>>> - add a flag to struct phy_device which indicates the status of
>>>   downshift.
>>> - on link-up, check the flag and report whether a downshift occurred,
>>>   printing whether a downshift occurred in phy_print_status() and
>>>   similar places.  (Yes, I know that there are some network drivers
>>>   that don't use phy_print_status().)
>>>
>>> The downshift flag could be made tristate - "unknown", "not downshifted"
>>> and "downshifted" - which would enable phy_print_status() to indicate
>>> whether there is downshift supported (and hence whether we need to pay
>>> more attention to what is going on when there is a slow-link report.)
>>>
>>> Something like:
>>>
>>> For no downshift:
>>> 	Link is Up - 1Gbps/Full - flow control off
>>> For downshift:
>>> 	Link is Up - 100Mbps/Full (downshifted) - flow control off
>>> For unknown:
>>> 	Link is Up - 1Gbps/Full (unknown downshift) - flow control off
>>>
>>> which has the effect of being immediately obvious if the driver lacks
>>> support.
>>>
>>> We may wish to consider PHYs which support no downshift ability as
>>> well, which should probably set the status to "not downshifted" or
>>> maybe an "unsupported" state.
>>>
>>> This way, if we fall back to the generic PHY driver, we'd get the
>>> "unknown" state.
>>>
>>
>> I'd like to split the topics. First we have downshift detection,
>> then we have downshift reporting/warning.
>>
>> *Downshift detection*
>> Prerequisite of course is that the PHY supports reading the actual,
>> possibly downshifted link speed (typically from a vendor-specific
>> register). Then the PHY driver has to set phydev->speed to the
>> actual link speed in the read_status() implementation.
>>
>> For the actual downshift detection we have two options:
>> 1. PHY provides info about a downshift event in a vendor-specific
>>    register or as an interrupt source.
>> 2. The generic method, compare actual link speed with the highest
>>    mutually advertised speed.
>> So far I don't see a benefit of option 1. The generic method is
>> easier and reduces complexity in drivers.
>>
>> The genphy driver is a fallback, and in addition may be intentionally
>> used for PHY's that have no specific features. A PHY with additional
>> features in general may or may not work properly with the genphy
>> driver. Some RTL8168-internal PHY's fail miserably with the genphy
>> driver. I just had a longer discussion about it caused by the fact
>> that on some distributions r8169.ko is in initramfs but realtek.ko
>> is not.
>> On a side note: Seems that so far the kernel doesn't provide an
>> option to express a hard module dependency that is not a code
>> dependency.
> 
> So how do we address the "fallback to genphy driver" problem for PHYs
> that do mostly work with genphy?  It is very easy to do, for example
> by omitting the PHY specific driver from the kernel configuration.
> Since the system continues to work in these cases, it may go unnoticed.
> 

A blacklist of PHY ID's known to have hard or soft issues when operated
with the genphy driver could break existing systems. But maybe we could
at least print a warning that genphy on the respective PHY ID is an
emergency fallback only and there may be functional restrictions.

However such a blacklist may be useful for PHY's that are known to not
work properly with genphy. RTL8211B is an example. On this PHY the
MMD registers are used for different proprietary purposes.
Writing to these registers in genphy_config_eee_advert() messes
up the PHY and makes it non-functional.

>> *Downshift reporting/warning*
>> In most cases downshift is caused by some problem with the cabling.
>> Users like the typical Ubuntu user in most cases are not familiar
>> with the concept of PHY downshift and what causes a downshift.
>> Therefore it's not sufficient to just report a downshift, we have
>> to provide the user with a hint what to do.
>> Adding the "downshifted" info to phy_print_status() is a good idea,
>> however I'd see it as an optional addition to the mentioned hint
>> to the user what to do.
>> The info "unknown downshift" IMO would just cause confusion. If we
>> have nothing to say, then why say something. Also users may interpret
>> "unknown" as "there's something wrong".
> 
> So you think reporting not-downshifted and no downshift capability
> implemented by the driver should appear to be identical?
> 
> You claimed as part of the patch description that a downshifted link
> was difficult to diagnose; it seems you aren't actually solving that
> problem - and in that case I would suggest that you should not be
> mentioning it in the commit log.
> 

Powered by blists - more mailing lists