[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b20f0964-42b7-53af-fe24-540d6cd011de@nvidia.com>
Date: Mon, 29 Aug 2022 14:21:35 +0300
From: Gal Pressman <gal@...dia.com>
To: "Keller, Jacob E" <jacob.e.keller@...el.com>,
Jakub Kicinski <kuba@...nel.org>
Cc: Saeed Mahameed <saeedm@...dia.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Simon Horman <horms@...ge.net.au>,
Andy Gospodarek <andy@...yhouse.net>
Subject: Re: [PATCH net-next 0/2] ice: support FEC automatic disable
On 29/08/2022 10:11, Keller, Jacob E wrote:
>
>> -----Original Message-----
>> From: Gal Pressman <gal@...dia.com>
>> Sent: Sunday, August 28, 2022 3:43 AM
>> To: Jakub Kicinski <kuba@...nel.org>; Keller, Jacob E <jacob.e.keller@...el.com>
>> Cc: Saeed Mahameed <saeedm@...dia.com>; netdev@...r.kernel.org; Simon
>> Horman <horms@...ge.net.au>; Andy Gospodarek <andy@...yhouse.net>
>> Subject: Re: [PATCH net-next 0/2] ice: support FEC automatic disable
>>
>> On 27/08/2022 02:57, Jakub Kicinski wrote:
>>> On Fri, 26 Aug 2022 10:51:21 -0700 Jacob Keller wrote:
>>>> On 8/25/2022 6:01 PM, Jakub Kicinski wrote:
>>>>> Oh, but per the IEEE standard No FEC is _not_ an option for CA-L.
>>>>> From the initial reading of your series I thought that Intel NICs
>>>>> would _never_ pick No FEC.
>>>> That was my original interpretation when I was first introduced to this
>>>> problem but I was mistaken, hence why the commit message wasn't clear :(
>>>>
>>>> This is rather more complicated than I originally understood and the
>>>> names for various bits have not been named very well so their behavior
>>>> isn't exactly obvious...
>>>>
>>>>> Sounds like we need a bit for "ignore the standard and try everything".
>>>>>
>>>>> What about BASE-R FEC? Is the FW going to try it on the CA-L cable?
>>>> Ok I got further clarification on this. We have a bit, "Auto FEC
>>>> enable", as well as a bitmask for which FEC modes to try.
>>>>
>>>> If "Auto FEC En" is set, then the Link Establishment State Machine will
>>>> try all of the FEC options we list in that bitmask, as long as we can
>>>> theoretically support them even if they aren't spec compliant.
>>>>
>>>> For old firmware the bitmask didn't include a bit for "No FEC", where as
>>>> the new firmware has a bit for "No FEC".
>>>>
>>>> We were always setting "Auto FEC En" so currently we try all FEC modes
>>>> we could theoretically support.
>>>>
>>>> If "Auto FEC En" is disabled, then we only try FEC modes which are spec
>>>> compliant. Additionally, only a single FEC mode is tried based on a
>>>> priority and the bitmask.
>>>>
>>>> Currently and historically the driver has always set "Auto FEC En", so
>>>> we were enabling non-spec compliant FEC modes, but "No FEC" was only
>>>> based on spec compliance with the media type.
>>>>
>>>> From this, I think I agree the correct behavior is to add a bit for
>>>> "override the spec and try everything", and then on new firmware we'd
>>>> set the "No FEC" while on old firmware we'd be limited to only trying
>>>> FEC modes.
>>>>
>>>> Does that make sense?
>>>>
>>>> So yea I think we do probably need a "ignore the standard" bit.. but
>>>> currently that appears to already be what ice does (excepting No FEC
>>>> which didn't previously have a bit to set for it)
>>> Thanks for getting to the bottom of this :)
>>>
>>> The "override spec modes" bit sounds like a reasonable addition,
>>> since we possibly have different behavior between vendors letting
>>> the user know if the device will follow the rules can save someone
>>> debugging time.
>>>
>>> But it does sound orthogonal to you adding the No FEC bit to the mask
>>> for ice.
>>>
>>> Let me add Simon and Andy so that we have the major vendors on the CC.
>>> (tl;dr the question is whether your FW follows the guidance of
>>> 'Table 110C–1—Host and cable assembly combinations' in AUTO FEC mode).
>>>
> The other engineers I spoke to also wanted to mention that 110C-1 is only a small subset of all of the various link types. They also mentioned something about an SFF standard which describes many more types.
The firmware folks here claim there's also a difference between TX and RX.
>
>>> If all the vendors already ignore the standard (like Intel and AFAIU
>>> nVidia) then we just need to document, no point adding the bit...
>> I think we misunderstood each other :).
>> Our implementation definitely *does not* ignore the standard.
>> When autoneg is disabled, and auto fec is enabled, the firmware chooses
>> a fec mode according to the spec. If "no FEC" is not in the spec, we
>> will not pick it (nor do we want to).
>> It sounds like you're not happy with the spec, then why not change it?
>> This doesn't sound like an area where we want to be non-compliant.
>>
>> Regardless, changing our interface because of one device' firmware
>> bug/behavior change doesn't make any sense. This interface affects all
>> consumers, and is going to stick with us forever. Firmware will
>> eventually get updated and it only affects one driver.
> Well, the current ice behavior for every FEC mode *except* No FEC, we try modes which may be supportable even though they're outside the spec. As far as I understand, the reason we attempt these is because it allows linking in more scenarios, presumably because some combination of things is not fully spec compliant? I don't really know for sure.
>
> For future firmware, this would include No FEC as well. Thus, even with future firmware we'd still be trying some modes outside of the spec. I can try to get some more answers tomorrow about the reasoning and justification for this behavior.
>
Yea, understood, but respectfully, I don't understand why we should go
along with your requirement to support this non-spec behavior.
Powered by blists - more mailing lists