netdev - Re: [PATCH net-next v3 01/47] dt-bindings: phy: Add Lynx 10G phy binding

Message-ID: <6240dce3-3b68-2df4-768e-ca82bcea518f@seco.com>
Date:   Thu, 21 Jul 2022 19:35:15 -0400
From:   Sean Anderson <sean.anderson@...o.com>
To:     Rob Herring <robh@...nel.org>
Cc:     "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Madalin Bucur <madalin.bucur@....com>,
        netdev <netdev@...r.kernel.org>, Paolo Abeni <pabeni@...hat.com>,
        Eric Dumazet <edumazet@...gle.com>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        Russell King <linux@...linux.org.uk>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Kishon Vijay Abraham I <kishon@...com>,
        Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
        Vinod Koul <vkoul@...nel.org>, devicetree@...r.kernel.org,
        "open list:GENERIC PHY FRAMEWORK" <linux-phy@...ts.infradead.org>
Subject: Re: [PATCH net-next v3 01/47] dt-bindings: phy: Add Lynx 10G phy
 binding

On 7/21/22 2:29 PM, Rob Herring wrote:
> On Thu, Jul 21, 2022 at 10:06 AM Sean Anderson <sean.anderson@...o.com> wrote:
>>
>>
>>
>> On 7/20/22 6:17 PM, Rob Herring wrote:
>> > On Fri, Jul 15, 2022 at 05:59:08PM -0400, Sean Anderson wrote:
>> >> This adds a binding for the SerDes module found on QorIQ processors. The
>> >> phy reference has two cells, one for the first lane and one for the
>> >> last. This should allow for good support of multi-lane protocols when
>> >> (if) they are added. There is no protocol option, because the driver is
>> >> designed to be able to completely reconfigure lanes at runtime.
>> >> Generally, the phy consumer can select the appropriate protocol using
>> >> set_mode. For the most part there is only one protocol controller
>> >> (consumer) per lane/protocol combination. The exception to this is the
>> >> B4860 processor, which has some lanes which can be connected to
>> >> multiple MACs. For that processor, I anticipate the easiest way to
>> >> resolve this will be to add an additional cell with a "protocol
>> >> controller instance" property.
>> >>
>> >> Each serdes has a unique set of supported protocols (and lanes). The
>> >> support matrix is configured in the device tree. The format of each
>> >> PCCR (protocol configuration register) is modeled. Although the general
>> >> format is typically the same across different SoCs, the specific
>> >> supported protocols (and the values necessary to select them) are
>> >> particular to individual SerDes. A nested structure is used to reduce
>> >> duplication of data.
>> >>
>> >> There are two PLLs, each of which can be used as the master clock for
>> >> each lane. Each PLL has its own reference. For the moment they are
>> >> required, because it simplifies the driver implementation. Absent
>> >> reference clocks can be modeled by a fixed-clock with a rate of 0.
>> >>
>> >> Signed-off-by: Sean Anderson <sean.anderson@...o.com>
>> >> ---
>> >>
>> >> Changes in v3:
>> >> - Manually expand yaml references
>> >> - Add mode configuration to device tree
>> >>
>> >> Changes in v2:
>> >> - Rename to fsl,lynx-10g.yaml
>> >> - Refer to the device in the documentation, rather than the binding
>> >> - Move compatible first
>> >> - Document phy cells in the description
>> >> - Allow a value of 1 for phy-cells. This allows for compatibility with
>> >>   the similar (but according to Ioana Ciornei different enough) lynx-28g
>> >>   binding.
>> >> - Remove minItems
>> >> - Use list for clock-names
>> >> - Fix example binding having too many cells in regs
>> >> - Add #clock-cells. This will allow using assigned-clocks* to configure
>> >>   the PLLs.
>> >> - Document the structure of the compatible strings
>> >>
>> >>  .../devicetree/bindings/phy/fsl,lynx-10g.yaml | 311 ++++++++++++++++++
>> >>  1 file changed, 311 insertions(+)
>> >>  create mode 100644 Documentation/devicetree/bindings/phy/fsl,lynx-10g.yaml
>> >>
>> >> diff --git a/Documentation/devicetree/bindings/phy/fsl,lynx-10g.yaml b/Documentation/devicetree/bindings/phy/fsl,lynx-10g.yaml
>> >> new file mode 100644
>> >> index 000000000000..a2c37225bb67
>> >> --- /dev/null
>> >> +++ b/Documentation/devicetree/bindings/phy/fsl,lynx-10g.yaml
>> >> @@ -0,0 +1,311 @@
>> >> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>> >> +%YAML 1.2
>> >> +---
>> >> +$id: http://devicetree.org/schemas/phy/fsl,lynx-10g.yaml#
>> >> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> >> +
>> >> +title: NXP Lynx 10G SerDes
>> >> +
>> >> +maintainers:
>> >> +  - Sean Anderson <sean.anderson@...o.com>
>> >> +
>> >> +description: |
>> >> +  These Lynx "SerDes" devices are found in NXP's QorIQ line of processors. The
>> >> +  SerDes provides up to eight lanes. Each lane may be configured individually,
>> >> +  or may be combined with adjacent lanes for a multi-lane protocol. The SerDes
>> >> +  supports a variety of protocols, including up to 10G Ethernet, PCIe, SATA, and
>> >> +  others. The specific protocols supported for each lane depend on the
>> >> +  particular SoC.
>> >> +
>> >> +definitions:
>> >
>> > $defs:
>>
>> That didn't work until recently :)
>>
>> I will change this for next revision.
>>
>> >> +  fsl,cfg:
>> >> +    $ref: /schemas/types.yaml#/definitions/uint32
>> >> +    minimum: 1
>> >> +    description: |
>> >> +      The configuration value to program into the field.
>> >
>> > What field?
>>
>> Ah, looks like this lost some context when I moved it. I will expand on this.
>>
>> >> +
>> >> +  fsl,first-lane:
>> >> +    $ref: /schemas/types.yaml#/definitions/uint32
>> >> +    minimum: 0
>> >> +    maximum: 7
>> >> +    description: |
>> >> +      The first lane in the group configured by fsl,cfg. This lane will have
>> >> +      the FIRST_LANE bit set in GCR0. The reset direction will also be set
>> >> +      based on whether this property is less than or greater than
>> >> +      fsl,last-lane.
>> >> +
>> >> +  fsl,last-lane:
>> >> +    $ref: /schemas/types.yaml#/definitions/uint32
>> >> +    minimum: 0
>> >> +    maximum: 7
>> >> +    description: |
>> >> +      The last lane configured by fsl,cfg. If this property is absent,
>> >> +      then it will default to the value of fsl,first-lane.
>> >> +
>> >> +properties:
>> >> +  compatible:
>> >> +    items:
>> >> +      - enum:
>> >> +          - fsl,ls1046a-serdes
>> >> +          - fsl,ls1088a-serdes
>> >> +      - const: fsl,lynx-10g
>> >> +
>> >> +  "#clock-cells":
>> >> +    const: 1
>> >> +    description: |
>> >> +      The cell contains the index of the PLL, starting from 0. Note that when
>> >> +      assigning a rate to a PLL, the PLLs' rates are divided by 1000 to avoid
>> >> +      overflow. A rate of 5000000 corresponds to 5GHz.
>> >> +
>> >> +  "#phy-cells":
>> >> +    minimum: 1
>> >> +    maximum: 2
>> >> +    description: |
>> >> +      The cells contain the following arguments:
>> >> +      - The first lane in the group. Lanes are numbered based on the register
>> >> +        offsets, not the I/O ports. This corresponds to the letter-based ("Lane
>> >> +        A") naming scheme, and not the number-based ("Lane 0") naming scheme. On
>> >> +        most SoCs, "Lane A" is "Lane 0", but not always.
>> >> +      - Last lane. For single-lane protocols, this should be the same as the
>> >> +        first lane.
>> >
>> > Perhaps a single cell with a lane mask would be simpler.
>>
>> Yes.
>>
>> >> +      If no lanes in a SerDes can be grouped, then #phy-cells may be 1, and the
>> >> +      first cell will specify the only lane in the group.
>> >
>> > It is generally easier to have a fixed number of cells.
>>
>> This was remarked on last time. I allowed this for better compatibility with the lynx
>> 28g serdes binding. Is that reasonable? I agree it would simplify the driver to just
>> have one cell type.
>>
>> >> +
>> >> +  clocks:
>> >> +    maxItems: 2
>> >> +    description: |
>> >> +      Clock for each PLL reference clock input.
>> >> +
>> >> +  clock-names:
>> >> +    minItems: 2
>> >> +    maxItems: 2
>> >> +    items:
>> >> +      enum:
>> >> +        - ref0
>> >> +        - ref1
>> >> +
>> >> +  reg:
>> >> +    maxItems: 1
>> >> +
>> >> +patternProperties:
>> >> +  '^pccr-':
>> >> +    type: object
>> >> +
>> >> +    description: |
>> >> +      One of the protocol configuration registers (PCCRs). These contain
>> >> +      several fields, each of which muxes a particular protocol onto a particular
>> >> +      lane.
>> >> +
>> >> +    properties:
>> >> +      fsl,pccr:
>> >> +        $ref: /schemas/types.yaml#/definitions/uint32
>> >> +        description: |
>> >> +          The index of the PCCR. This is the same as the register name suffix.
>> >> +          For example, a node for PCCRB would use a value of '0xb' for an
>> >> +          offset of 0x22C (0x200 + 4 * 0xb).
>> >> +
>> >> +    patternProperties:
>> >> +      '^(q?sgmii|xfi|pcie|sata)-':
>> >> +        type: object
>> >> +
>> >> +        description: |
>> >> +          A configuration field within a PCCR. Each field configures one
>> >> +          protocol controller. The value of the field determines the lanes the
>> >> +          controller is connected to, if any.
>> >> +
>> >> +        properties:
>> >> +          fsl,index:
>> >
>> > indexes are generally a red flag in binding. What is the index, how does
>> > it correspond to the h/w and why do you need it.
>>
>> As described in the description below, the "index" is the protocol controller suffix,
>> corresponding to a particular field (or set of fields) in the protocol configuration
>> registers.
>>
>> > If we do end up needing
>> > it, 'reg' is generally how we address some component.
>>
>> I originally used reg, but I got warnings about inheriting #size-cells and
>> #address-cells. These bindings are already quite verbose to write out (there
>> are around 10-20 configurations per SerDes to describe) and I would like to
>> minimize the amount of properties to what is necessary. Additionally, this
>> really describes a particular index of a field, and not a register (or an offset
>> within a register).
> 
> Are you trying to describe all possible configurations in DT? Don't.
> The DT should be the config for the specific board, not a menu of
> possible configurations.

See reasons 2 and 3 below.

>> >> +            $ref: /schemas/types.yaml#/definitions/uint32
>> >> +            description: |
>> >> +              The index of the field. This corresponds to the suffix in the
>> >
>> > What field?
>>
>> The one from the description above.
>>
>> >> +              documentation. For example, PEXa would be 0, PEXb 1, etc.
>> >> +              Generally, higher fields occupy lower bits.
>> >> +
>> >> +              If there are any subnodes present, they will be preferred over
>> >> +              fsl,cfg et al.
>> >> +
>> >> +          fsl,cfg:
>> >> +            $ref: "#/definitions/fsl,cfg"
>> >> +
>> >> +          fsl,first-lane:
>> >> +            $ref: "#/definitions/fsl,first-lane"
>> >> +
>> >> +          fsl,last-lane:
>> >> +            $ref: "#/definitions/fsl,last-lane"
>> >
>> > Why do you have lane assignments here and in the phy cells?
>>
>> For three reasons. First, because we need to know what protocols are valid on what
>> lanes. The idea is to allow the MAC to configure the protocols at runtime. To do
>> this, someone has to figure out if the protocol is supported on that lane. The
>> best place to put this IMO is the serdes.
> 
> Within ethernet protocols, that makes sense.
> 
>> Second, some serdes have (mostly) unsupported protocols such as PCIe as well as
>> Ethernet protocols. To allow using Ethernet, we need to know which lanes are
>> configured (by the firmware/bootloader) for some other protocol. That way, we
>> can avoid touching them.
> 
> The ones needed for ethernet are the ones with a connection to the
> ethernet MACs with the 'phys' properties. Why don't you just ignore
> the !ethernet ones?

That's what I try to do. However, non-ethernet modes can use the same
lanes as ethernet modes. So we need to know how the protocol selection
registers are laid out, and what bits select which lanes. Although the
general layout is mostly the same [1], the mapping is specific to the
individual serdes on the individual SoC.

[1] Occasionally, the layout of registers changes between different SoC
    revisions. Usually this is because one of the registers ran out of
    bits.

>> Third, as part of the probe sequence, we need to ensure that no protocol controllers
>> are currently selected. Otherwise, we will get strange problems later when we try
>> to connect multiple protocol controllers to the same lane.
> 
> Sounds like a kernel problem...

Of course, but this stuff has to come from somewhere. Due to the second
reason above we can't just clear out all the PCCRs. We need to know
whether a lane is in use or not.
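
As a rough sketch of that probe-time step (the struct and function names
here are hypothetical and not taken from the driver; the field table is
assumed to come from wherever the layout ends up being described), only
the selectors known to belong to Ethernet protocol controllers are
cleared, and everything else in the PCCR is left alone:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one PCCR field; not the driver's type. */
struct pccr_field {
	uint32_t mask;	/* bits of the PCCR occupied by this field */
	bool ethernet;	/* field selects an Ethernet protocol controller */
};

/*
 * Return the PCCR value with every known Ethernet selector cleared.
 * Fields for other protocols (e.g. PCIe configured by the bootloader)
 * and bits not described in the table are preserved.
 */
static uint32_t pccr_deselect_ethernet(uint32_t pccr,
				       const struct pccr_field *fields,
				       size_t nfields)
{
	for (size_t i = 0; i < nfields; i++)
		if (fields[i].ethernet)
			pccr &= ~fields[i].mask;
	return pccr;
}
```

Without the per-field masks there is no way to tell which bits are safe
to clear, which is why the layout has to be known somewhere.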

>>
>> >> +
>> >> +          fsl,proto:
>> >> +            $ref: /schemas/types.yaml#/definitions/string
>> >> +            enum:
>> >> +              - sgmii
>> >> +              - sgmii25
>> >> +              - qsgmii
>> >> +              - xfi
>> >> +              - pcie
>> >> +              - sata
>> >
>> > We have standard phy modes already for at least most of these types.
>> > Generally the mode is set in the phy cells.
>>
>> Yes, but this is the "protocol" which may correspond to multiple phy modes.
>> For example, sgmii25 allows SGMII, 1000BASE-X, 1000BASE-KR, and 2500BASE-X
>> phy modes.
> 
> As phy mode is more specific than protocol (or mode implies protocol),
> why do we need protocol in DT?

The protocol (along with the PCCR and the protocol controller index) is
used to determine the bitmask for a particular selector. For example,
PCCR1 on the T1040 has the following layout:

Bits  Field name
===== ==========
 0- 1 SGMIIA_CFG
 2- 3 SGMIIB_CFG
 4- 5 SGMIIC_CFG
 6- 7 SGMIID_CFG
 8- 9 SGMIIE_CFG
10-11 SGMIIF_CFG
12-15 Reserved
   16 SGMIIA_KX
   17 SGMIIB_KX
   18 SGMIIC_KX
   19 SGMIID_KX
   20 SGMIIE_KX
   21 SGMIIF_KX
22-23 Reserved
24-25 QSGMA_CFG
26-27 Reserved
28-29 QSGMB_CFG
30-31 Reserved

Note that the KX bit (determining whether 1000BASE-X/SGMII or
1000BASE-KX is enabled) is not contiguous with the CFG field. Instead,
the "index" of the protocol controller is used to determine the correct
mask to use for the CFG field as well as the KX bit. Also note that this
register is inhomogeneous. Just the "index" is not enough: we need to
know what the protocol is as well.
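
To make that concrete, here is a minimal sketch of deriving the masks for
the layout above from the protocol and the index. The function names are
made up for illustration, and it assumes the table uses the big-endian
convention where bit 0 is the most significant bit; if the numbering is
LSB-first instead, only field_mask() changes.

```c
#include <assert.h>
#include <stdint.h>

enum proto { PROTO_SGMII, PROTO_QSGMII };

/* Mask for table bits first..last, assuming bit 0 is the MSB. */
static uint32_t field_mask(unsigned int first, unsigned int last)
{
	unsigned int width;
	uint32_t ones;

	assert(first <= last && last < 32);
	width = last - first + 1;
	ones = width == 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1;
	return ones << (31 - last);
}

/* CFG field mask in PCCR1 for a protocol controller; 0 if there is none. */
static uint32_t pccr1_cfg_mask(enum proto proto, unsigned int idx)
{
	switch (proto) {
	case PROTO_SGMII:	/* SGMIIA_CFG..SGMIIF_CFG: bits 0-1, 2-3, ... */
		return idx < 6 ? field_mask(2 * idx, 2 * idx + 1) : 0;
	case PROTO_QSGMII:	/* QSGMA_CFG: bits 24-25, QSGMB_CFG: bits 28-29 */
		return idx < 2 ? field_mask(24 + 4 * idx, 25 + 4 * idx) : 0;
	}
	return 0;
}

/* KX bit mask in PCCR1; only the SGMII controllers have one. */
static uint32_t pccr1_kx_mask(enum proto proto, unsigned int idx)
{
	if (proto != PROTO_SGMII || idx >= 6)
		return 0;
	return field_mask(16 + idx, 16 + idx);
}
```

The same index gives a different stride and base depending on the
protocol, and only SGMII has a second, non-contiguous bit.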

> [...]
> 
>> >> +        xfi-1 {
>> >> +          fsl,index = <1>;
>> >> +          fsl,cfg = <0x1>;
>> >> +          fsl,first-lane = <0>;
>> >> +          fsl,proto = "xfi";
>> >> +        };
>> >> +      };
>> >> +    };
>> >
>> > Other than lane assignments and modes, I don't really understand what
>> > you are trying to do.
>>
>> This is touched on a bit above, but the idea here is to allow for dynamic
>> reconfiguration of the serdes mode in order to support multiple ethernet
>> phy modes at runtime. To do this, we need to know about all the available
>> protocol controllers, and the lanes they support. In particular, the
>> available controllers and the lanes they map to (and the values to
>> program to select them) differ even between different serdes on the same
>> SoC.
>>
>> > It all looks too complex and I don't see any other
>> > phy bindings needing something this complex.
>>
>> This was explicitly asked for last time. I also would not like to do this,
>> but you and Krzysztof Kozlowski were very opposed to having per-device
>> compatible strings. If you have a suggestion for a different approach, I
>> am all ears. I find it very frustrating that the primary feedback I get from
>> the device tree folks is "you can't do this" without a corresponding "do it
>> this way."
> 
> How much time do you expect that we spend on your binding which is
> only 1 out of the 100-200 patches we get a week?

I appreciate the work you do on this. But every revision I make without
knowing whether I'm on the right track wastes both of our time. I have
to spend my time coming up with and implementing a new binding and you
have to spend time reviewing it. A nudge in the right direction can
easily save us both time.

> We're not experts in all kinds of h/w and the experts for specific h/w
> don't always care about DT bindings.

Vinod, this is why I (and presumably Rob) would appreciate your feedback.

> We often get presented with solutions without sufficient explanations
> of the problem. If I don't understand the problem, how can I propose a
> solution? We can only point out what doesn't fit within normal DT
> patterns. PHYs with multiple modes supported is not a unique problem,
> so why are existing ways to deal with that not sufficient and why do
> you need a *very* specific binding?

Well, take for example xlnx,zynqmp-psgtr. Although it is not obvious
from the binding, there are several things which simplify the driver.
First, all the modes are completely incompatible. Any consumer will not
need to switch modes at runtime. Second, there is only one GTR device
per SoC. That means that the compatible string completely determines
the available modes. The mode/lane mapping can be stored in
the driver instead of in the device tree. Last, there is only one
variant of this device. There are no other SoCs with slightly different
register layout, mode support, or lanes.

In contrast, this device has several almost-compatible modes.
We cannot just set the mode at boot and be done with it (in fact this is
exactly what I am trying to change by adding a driver). Some modes are
so similar that they reuse protocol controllers, but they still need to
have different lane configurations. There are multiple different SerDes
devices on each SoC. While they have the same register layout, the
connected protocol controllers (and lane mappings) are different. There
are also different SoCs with (ever-so-slightly) different register
layouts, protocol controllers, and lane mappings.

All of this sort of information would normally just be stored in the
driver as a set of struct arrays. In fact, this is what I did the first
time!
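
For illustration only, such a table might look roughly like the sketch
below. The type and function names are hypothetical and the entries are
made-up values, not real SoC data; the offset formula is the one from the
fsl,pccr description (0x200 + 4 * index).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-SerDes mode entry; not the actual driver structure. */
struct lynx_mode_sketch {
	const char *proto;	/* "sgmii", "qsgmii", "xfi", ... */
	uint8_t pccr;		/* PCCR index, e.g. 0xb for PCCRB */
	uint8_t idx;		/* protocol controller index in that PCCR */
	uint8_t cfg;		/* value to program to select these lanes */
	uint8_t first_lane;
	uint8_t last_lane;
};

/* Made-up values for one SerDes; each SerDes would get its own array. */
static const struct lynx_mode_sketch serdes1_modes[] = {
	{ "sgmii",  0x8, 2, 1, 1, 1 },
	{ "qsgmii", 0x9, 0, 2, 1, 1 },
	{ "xfi",    0xb, 1, 1, 1, 1 },
};

/* Register offset of a PCCR: 0x200 + 4 * index. */
static uint32_t pccr_offset(uint8_t pccr)
{
	return 0x200 + 4 * (uint32_t)pccr;
}

/* Find the entry providing a protocol on a lane, or NULL if unsupported. */
static const struct lynx_mode_sketch *
find_mode(const struct lynx_mode_sketch *modes, size_t n,
	  const char *proto, uint8_t lane)
{
	for (size_t i = 0; i < n; i++)
		if (!strcmp(modes[i].proto, proto) &&
		    lane >= modes[i].first_lane && lane <= modes[i].last_lane)
			return &modes[i];
	return NULL;
}
```

One array like this per SerDes per SoC is what a per-device compatible
string would select.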

> With the phy binding, you know what each lane is connected to. You can
> put whatever information you want in the phy cells to configure the
> phy for that client. The phy cells are defined by the provider and
> opaque to the consumer. Yes, we like to standardize cells when
> possible, but that's only a convenience. I'm not saying phy cells is
> the answer for everything and define 10 cells worth of data either.

Maybe it's better to do something like

	// first-lane last-lane protocol pccr idx val
	phys = <&serdes1 1 1 PHY_TYPE_SGMII 0x8 2 1>,
	       <&serdes1 1 1 PHY_TYPE_QSGMII 0x9 0 2>,
	       <&serdes1 1 1 PHY_TYPE_10GBASER 0xb 1 1>;
	phy-names = "sgmii", "qsgmii", "xfi";

(made up values)

But this doesn't play well with the existing idiom of being able to call
phy_set_mode(). Plus, existing drivers expect to have one (devicetree)
phy for one physical serdes.

What about

	phys = <&serdes1_lane1>;

and then under the serdes node do something like

	serdes1: phy@foo {
		...

		serdes1_lane1 {
			first-lane = <1>;

			sgmii {
				fsl,pccr = <0x8>;
				fsl,idx = <2>;
				fsl,cfg = <1>;
				fsl,proto = "sgmii";
				// or PHY_TYPE_SGMII
			};

			qsgmii {
				...
			};

			xfi {
				...
			};
		};
	};

and this way you could have something like a fsl,reserved property to
deal with not-yet-supported lanes. And this could be added piecemeal by
board configs.

--Sean