lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9718cf64-544a-9efb-409d-ada7c2d927f1@mellanox.com>
Date:   Thu, 25 Jun 2020 16:34:48 +0300
From:   Amit Cohen <amitc@...lanox.com>
To:     Jacob Keller <jacob.e.keller@...el.com>
Cc:     Ido Schimmel <idosch@...sch.org>, netdev@...r.kernel.org,
        davem@...emloft.net, kuba@...nel.org, jiri@...lanox.com,
        petrm@...lanox.com, mlxsw@...lanox.com, mkubecek@...e.cz,
        andrew@...n.ch, f.fainelli@...il.com, linux@...pel-privat.de,
        Ido Schimmel <idosch@...lanox.com>
Subject: Re: [PATCH net-next 04/10] Documentation: networking:
 ethtool-netlink: Add link extended state

On 24-Jun-20 22:07, Jacob Keller wrote:
> 
> 
> On 6/24/2020 1:19 AM, Ido Schimmel wrote:
>> From: Amit Cohen <amitc@...lanox.com>
>>
>> Add link extended state attributes.
>>
>> Signed-off-by: Amit Cohen <amitc@...lanox.com>
>> Reviewed-by: Petr Machata <petrm@...lanox.com>
>> Reviewed-by: Jiri Pirko <jiri@...lanox.com>
>> Signed-off-by: Ido Schimmel <idosch@...lanox.com>
> 
> I really like this concept.
> 
> Reviewed-by: Jacob Keller <jacob.e.keller@...el.com>
> 
>> ---
>>  Documentation/networking/ethtool-netlink.rst | 110 ++++++++++++++++++-
>>  1 file changed, 106 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
>> index 82470c36c27a..a7cc53f905f5 100644
>> --- a/Documentation/networking/ethtool-netlink.rst
>> +++ b/Documentation/networking/ethtool-netlink.rst
>> @@ -443,10 +443,11 @@ supports.
>>  LINKSTATE_GET
>>  =============
>>  
>> -Requests link state information. At the moment, only link up/down flag (as
>> -provided by ``ETHTOOL_GLINK`` ioctl command) is provided but some future
>> -extensions are planned (e.g. link down reason). This request does not have any
>> -attributes.
>> +Requests link state information. Link up/down flag (as provided by
>> +``ETHTOOL_GLINK`` ioctl command) is provided. Optionally, extended state might
>> +be provided as well. In general, extended state describes reasons for why a port
>> +is down, or why it operates in some non-obvious mode. This request does not have
>> +any attributes.
>>  
> 
> Ok, so basically in addition to the standard ETHTOOL_GLINK data, we also
> add additional optional extended attributes over the netlink interface.
> Neat.
> 
>>  Request contents:
>>  
>> @@ -461,16 +462,117 @@ Kernel response contents:
>>    ``ETHTOOL_A_LINKSTATE_LINK``          bool    link state (up/down)
>>    ``ETHTOOL_A_LINKSTATE_SQI``           u32     Current Signal Quality Index
>>    ``ETHTOOL_A_LINKSTATE_SQI_MAX``       u32     Max support SQI value
>> +  ``ETHTOOL_A_LINKSTATE_EXT_STATE``     u8      link extended state
>> +  ``ETHTOOL_A_LINKSTATE_EXT_SUBSTATE``  u8      link extended substate
>>    ====================================  ======  ============================
> 
> Ok so we have categories (state) and each category can have sub-states
> indicating the specific failure. Good.
> 
>>  
>>  For most NIC drivers, the value of ``ETHTOOL_A_LINKSTATE_LINK`` returns
>>  carrier flag provided by ``netif_carrier_ok()`` but there are drivers which
>>  define their own handler.
>>  
>> +``ETHTOOL_A_LINKSTATE_EXT_STATE`` and ``ETHTOOL_A_LINKSTATE_EXT_SUBSTATE`` are
>> +optional values. ethtool core can provide either both
>> +``ETHTOOL_A_LINKSTATE_EXT_STATE`` and ``ETHTOOL_A_LINKSTATE_EXT_SUBSTATE``,
>> +or only ``ETHTOOL_A_LINKSTATE_EXT_STATE``, or none of them.
>> +
>>  ``LINKSTATE_GET`` allows dump requests (kernel returns reply messages for all
>>  devices supporting the request).
>>  
> 
> Good to clarify that it is allowed to specify only the main state, if a
> substate isn't known.
> 

I described above all the options:
"ethtool core can provide either (1) both ``ETHTOOL_A_LINKSTATE_EXT_STATE``
and ``ETHTOOL_A_LINKSTATE_EXT_SUBSTATE``, or (2) only ``ETHTOOL_A_LINKSTATE_EXT_STATE``,
or (3) none of them"

I think that #2 is what you want, no?

>>  
>> +Link extended states:
>> +
>> +  ============================    =============================================
>> +  ``Autoneg``                     States relating to the autonegotiation or
>> +                                  issues therein
>> +
>> +  ``Link training failure``       Failure during link training
>> +
>> +  ``Link logical mismatch``       Logical mismatch in physical coding sublayer
>> +                                  or forward error correction sublayer
>> +
>> +  ``Bad signal integrity``        Signal integrity issues
>> +
>> +  ``No cable``                    No cable connected
>> +
>> +  ``Cable issue``                 Failure is related to cable,
>> +                                  e.g., unsupported cable
>> +
>> +  ``EEPROM issue``                Failure is related to EEPROM, e.g., failure
>> +                                  during reading or parsing the data
>> +
>> +  ``Calibration failure``         Failure during calibration algorithm
>> +
>> +  ``Power budget exceeded``       The hardware is not able to provide the
>> +                                  power required from cable or module
>> +
>> +  ``Overheat``                    The module is overheated
>> +  ============================    =============================================
> 
> A nice variety of states. I think it could be argued that "no cable" is
> a sub-state of "cable issues" but personally I like that it's separate.
> 
> I can't think of any other states offhand, but we have a u8, so we have
> plenty of space to add more states if/when we think of them in the future.
> 
>> +
>> +Link extended substates:
>> +
>> +  Autoneg substates:
>> +
>> +  ==============================================    =============================================
>> +  ``No partner detected``                           Peer side is down
>> +
>> +  ``Ack not received``                              Ack not received from peer side
>> +
>> +  ``Next page exchange failed``                     Next page exchange failed
>> +
>> +  ``No partner detected during force mode``         Peer side is down during force mode or there
>> +                                                    is no agreement of speed
>> +
> 
> This feels like it could be two separate states. It seems weird to
> combine "peer side is down during force" with "no agreement of speed".
> Am I missing something here?
> 

>From high-level API point of view it makes sense, but we get this reason from our FW for both cases.
When link is configured to force mode, it doesn't know his partner's state, so partner is down and partner has different speed are same.

>> +  ``FEC mismatch during override``                  Forward error correction modes in both sides
>> +                                                    are mismatched
>> +
>> +  ``No HCD``                                        No Highest Common Denominator
>> +  ==============================================    =============================================
>> +
>> +  Link training substates:
>> +
>> +  ==============================================    =============================================
>> +  ``KR frame lock not acquired``                    Frames were not recognized, the lock failed
>> +
>> +  ``KR link inhibit timeout``                       The lock did not occur before timeout
>> +
>> +  ``KR Link partner did not set receiver ready``    Peer side did not send ready signal after
>> +                                                    training process
>> +
>> +  ``Remote side is not ready yet``                  Remote side is not ready yet
>> +
>> +  ==============================================    =============================================
>> +
>> +  Link logical mismatch substates:
>> +
>> +  ==============================================    =============================================
>> +  ``PCS did not acquire block lock``                Physical coding sublayer was not locked in
>> +                                                    first phase - block lock
>> +
>> +  ``PCS did not acquire AM lock``                   Physical coding sublayer was not locked in
>> +                                                    second phase - alignment markers lock
>> +
>> +  ``PCS did not get align_status``                  Physical coding sublayer did not get align
>> +                                                    status
>> +
>> +  ``FC FEC is not locked``                          FC forward error correction is not locked
>> +
>> +  ``RS FEC is not locked``                          RS forward error correction is not locked
>> +  ==============================================    =============================================
>> +
>> +  Bad signal integrity substates:
>> +
>> +  ==============================================    =============================================
>> +  ``Large number of physical errors``               Large number of physical errors
>> +
>> +  ``Unsupported rate``                              The system attempted to operate the cable at
>> +                                                    a rate that is not formally supported, which
>> +                                                    led to signal integrity issues
>> +
>> +  ``Unsupported cable``                             Unsupported cable
>> +
>> +  ``Cable test failure``                            Cable test failure
>> +  ==============================================    =============================================
>> +
> 
> Not every state has substates. Makes sense, since some of the main
> states are pretty straight forward.
> 
> This feels very promising, and enables providing some useful information
> to users about why something isn't working as expected. I think it's a
> significant improvement to the status quo, given that a device can
> report this data.
> 
> Thanks,
> Jake
> 

Thanks for the comments.

>>  DEBUG_GET
>>  =========
>>  
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ