lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 14 Jun 2018 20:24:34 -0700
From:   Florian Fainelli <f.fainelli@...il.com>
To:     Don Bollinger <don@...bollingers.org>, Andrew Lunn <andrew@...n.ch>
Cc:     Tom Lendacky <thomas.lendacky@....com>,
        Arnd Bergmann <arnd@...db.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-kernel@...r.kernel.org, brandon_chuang@...e-core.com,
        wally_wang@...ton.com, roy_lee@...e-core.com,
        rick_burchett@...e-core.com, quentin.chang@...ntatw.com,
        steven.noble@...switch.com, jeffrey.townsend@...switch.com,
        scotte@...ulusnetworks.com, roopa@...ulusnetworks.com,
        David Ahern <dsa@...ulusnetworks.com>,
        luke.williams@...onical.com, Guohan Lu <gulv@...rosoft.com>,
        Russell King <rmk+kernel@....linux.org.uk>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH] optoe: driver to read/write SFP/QSFP EEPROMs



On 06/14/2018 07:26 PM, Don Bollinger wrote:
> On Tue, Jun 12, 2018 at 08:11:09PM +0200, Andrew Lunn wrote:
>>> There's an SFP driver under drivers/net/phy.  Can that driver be extended
>>> to provide this support?  Adding Russel King who developed sfp.c, as well
>>> at the netdev mailing list.
>>
>> I agree, the current SFP code should be used.
>>
>> My observations seem to be there are two different ways {Q}SFP are used:
>>
>> 1) The Linux kernel has full control, as assumed by the devlink/SFP
>> frame work. We parse the SFP data to find the capabilities of the SFP
>> and use it to program the MAC to use the correct mode. The MAC can be
>> a NIC, but it can also be a switch. DSA is gaining support for
>> PHYLINK, so SFP modules should just work with most switches which DSA
>> support.  And there is no reason a plain switchdev switch can not use
>> PHYLINK.
>>
>> 2) Firmware is in control of the PHY layer, but there is a wish to
>> expose some of the data which is available via i2c from the {Q}SFP to
>> linux.
>>
>> It appears this optoe supports this second case. It does not appear to
>> support any in kernel API to actually make use of the SFP data in the
>> kernel.
>>
>> We should not be duplicating code. We should share the SFP code for
>> both use cases above. There is also a Linux standard API for getting
>> access to this information. ethtool -m/--module-info. Anything which
>> is exporting {Q}SFP data needs to use this API.
>>
>>    Andrew
> 
> Actually this is better described by a third use case.  The target
> switches are PHY-less (see various designs at
> www.compute.org/wiki/Networking/SpecsAndDesigns). The AS5712 for example
> says "The AS5712-54X is a PHY-Less design with the SFP+ and QSFP+
> connections directly attaching to the Serdes interfaces of the Broadcom
> BCM56854 720G Trident 2 switching silicon..."
> 
> The electrical controls of the {Q}SFP devices (TxDisable for example)
> are organized in a platform dependent way, through CPLD devices, and
> managed by a platform specific CPLD driver.
> 
> The i2c bus is muxed from the CPU to all of the {Q}SFP devices, which
> are set up as standard linux i2c devices
> (/sys/bus/i2c/devices/i2c-xxxx).
> 
> There is no MDIO bus between the CPU and the {Q}SFP devices.
> 
>> 2) Firmware is in control of the PHY layer, but there is a wish to
>> expose some of the data which is available via i2c from the {Q}SFP to
>> linux.
> 
> So the switch silicon is in control of the PHY layer.  The platform
> driver is in control of the electrical interfaces.  And the EEPROM data
> is available via I2C.
> 
> And, there isn't actually 'a wish to expose' the EEPROM data to linux
> (the kernel).  It turns out that none of the NOS partners I'm working
> with use that data *in the kernel*.  It is all managed from user space.
> 
> More generally, I think sfp.c and optoe are not actually trying to
> accomplish the same thing at all.  sfp.c combines all three functions
> (PHY, electrical control, EEPROM access).  optoe is only providing EEPROM
> access, and only to user space.  This is a real need in the white box
> switch environment, and is not met by sfp.c.  optoe isn't better, sfp.c
> isn't better, they're just different.

sfp exposes standard ethtool hooks such as get_module_info() which pass
the whole data blob to user-space, e.g: ethtool where all of this is
better interpreted.

> 
> Finally, sfp.c does not recognize that SFP devices have data beyond 512
> bytes, accessible via a page register.  It also does not recognize QSFP
> devices at all.  QSFP devices have only 256 bytes accessible (one i2c
> address) before switching to paged access for the remaining data.  The
> first design requirement for optoe was to access all the available
> pages, because there is information and controls that we (optics
> vendors) want to make available to network management applications.

Patches welcome if you wish to extend sfp.c to support QSFP devices for
instances.

> 
> If sfp.c creates a standard linux i2c client for each SFP device, it
> should be possible to create an optoe managed device 'under' sfp.c to
> provide access to the full EEPROM address space:

It's the other way around, SFP relies on a standard Linux i2c bus master
to exist such that it can read the EEPROM from the standard slave
address location, same goes with a possibly present PHY.

>   # echo optoe2 0x50 >/sys/bus/i2c/devices/i2c-xx/new_device
> This might prove useful to user space consumers of that data.  We could
> also easily add a kernel API (eg the nvmem framework) to optoe to provide
> kernel access.  In other words, sfp.c could assign EEPROM management to
> optoe, while managing the electrical interfaces.  (This is actually
> pretty close to how the platfom drivers work in the switch environment.)
> sfp.c would get SFP page support and QSFP EEPROM access 'for free'.

That sounds like a possibly good approach.

> 
>>                       There is also a Linux standard API for getting
>> access to this information. ethtool -m/--module-info. Anything which
>> is exporting {Q}SFP data needs to use this API.
> 
> optoe simply provides direct access from user space to the full EEPROM
> data.  There is more data there than ethtool knows about, and in some
> devices there are proprietary registers that ethtool will never know
> about.  optoe does not interpret any of the EEPROM content (except the
> bare minimum to access pages correctly).  optoe also does not get in the
> way of ethtool.  It could prove to be a handy way for ethtool to access
> new EEPROM fields in the future.  QSFP-DD/OSFP are coming soon, they
> will have a different (incompatible) set of new fields to be decoded.

sfp is the same it only passes the EEPROM information to user-space and
interprets just what it needs to get the work done.

> 
> Bottom Line:  sfp.c is not a useful starting point for the switch
> environment I'm working in.  The underlying hardware architecture is
> quite different.  optoe is not a competing alternative.  Its only
> function is to provide user-space access to the EEPROM data in {Q}SFP
> devices.

I just don't understand why would we want optoe when the standard way to
expose both EEPROM and diagnostics modules has been through ethtool's
get_module_info since the basic paradigm is that a network port is a
net_device instance in the kernel. If that basic device driver model
does not exist, then it is unclear to me what are the benefits.

Would I be completely wrong if I wrote that you are likely working with
a switch SDK which primarily runs in user-space and so with lack of a
proper kernel-based device driver for your piece of hardware you are
attempting to create a driver which is some sort of a "tap" for some
specific portion of that larger hardware block?
-- 
Florian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ