Date:	Wed, 12 Feb 2014 09:57:25 +0100
From:	Gerlando Falauto <gerlando.falauto@...mile.com>
To:	Florian Fainelli <f.fainelli@...il.com>
CC:	Matthew Garrett <matthew.garrett@...ula.com>,
	netdev <netdev@...r.kernel.org>,
	"devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Kishon Vijay Abraham I <kishon@...com>
Subject: Re: [PATCH V3] net/dt: Add support for overriding phy configuration
 from device tree

Hi Florian,

On 02/11/2014 06:43 PM, Florian Fainelli wrote:
> Hi Gerlando,
>
> 2014-02-11 1:09 GMT-08:00 Gerlando Falauto <gerlando.falauto@...mile.com>:
>> Hi Florian,
>>
>> first of all, thank you for your answer.
>>
>>
>> On 02/10/2014 06:09 PM, Florian Fainelli wrote:
>>>
>>> Hi Gerlando,
>>>
>>> Le lundi 10 février 2014, 17:14:59 Gerlando Falauto a écrit :
>>>>
>>>> Hi,
>>>>
>>>> I'm currently trying to fix an issue for which this patch provides a
>>>> partial solution, so apologies in advance for jumping into the
>>>> discussion for my own purposes...
>>>>
>>>> On 02/04/2014 09:39 PM, Florian Fainelli wrote:
>>>>    > 2014-01-17 Matthew Garrett <matthew.garrett@...ula.com>:
>>>>    >> Some hardware may be broken in interesting and board-specific ways, such
>>>>    >> that various bits of functionality don't work. This patch provides a
>>>>    >> mechanism for overriding mii registers during init based on the contents of
>>>>    >> the device tree data, allowing board-specific fixups without having to
>>>>    >> pollute generic code.
>>>>    >
>>>>    > It would be good to explain exactly how your hardware is broken
>>>>    > exactly. I really do not think that such a fine-grained setting where
>>>>    > you could disable, e.g: 100BaseT_Full, but allow 100BaseT_Half to
>>>>    > remain usable makes that much sense. In general, Gigabit might be
>>>>    > badly broken, but 100 and 10Mbits/sec should work fine. How about the
>>>>    > MASTER-SLAVE bit, is overriding it really required?
>>>>    >
>>>>    > Is not a PHY fixup registered for a specific OUI the solution you are
>>>>    > looking for? I am also concerned that this creates PHY troubleshooting
>>>>    > issues much harder to debug than before as we may have no idea about
>>>>    > how much information has been put in Device Tree to override that.
>>>>    >
>>>>    > Finally, how about making this more general just like the BCM87xx PHY
>>>>    > driver, which is supplied value/reg pairs directly? There are 16
>>>>    > common MII registers, and 16 others for vendor specific registers,
>>>>    > this is just covering for about 2% of the possible changes.
>>>>
>>>> Good point. That would easily help me with my current issue, which
>>>> requires autoneg to be disabled to begin with (by clearing BMCR_ANENABLE
>>>> from register 0).
>>>
>>>
>>> Is there a point in time (e.g: after some specific initial configuration has
>>> been made) where BMCR_ANENABLE can be used?
>>
>>
>> What do you mean? In my case, for some HW-related reason (due to the PHY
>> counterpart I guess) autoneg needs to be disabled.
>> This is currently done by the bootloader code (which clears the bit).
>> What I'm looking for is some way for the kernel to either reinforce this
>> setting, or just take that into account and skip autoneg.
>> On top of that, there's a HW erratum about that particular PHY, which
>> requires certain operations to be performed on the PHY as a workaround *WHEN
>> AUTONEG IS DISABLED*. That I'd implement in a PHY-specific driver.
>
> Ok.
>
>>
>>
>>>> This would not however fix it entirely (I tried a quick hardwired
>>>> implementation), as the whole PHY machinery would not take that into
>>>> account and would re-enable autoneg anyway.
>>>> I also tried changing the patch so that phydev->supported gets updated
>>>
>>>
>>> There are multiple things that you could try doing here:
>>>
>>> - override the PHY state machine in your read_status callback to make sure
>>> that you always set phydev->autoneg to AUTONEG_ENABLE
>>
>>
>> [you mean AUTONEG_DISABLE, right?]
>
> Right, I fat fingered here.
>
>> Uhm, but I don't want to implement a driver for that PHY that always
>> disables autoneg. I only want to disable autoneg for that particular board.
>> I figure I might register a fixup for that board, but that kind of makes
>> everything more complicated and less clear. Plus, what should be the
>> criterion to determine whether we're running on that particular hardware?
>
> of_machine_is_compatible() plus reading the specific PHY OUI should
> provide you with a unique machine + PHY tuple, if your machine
> name is too generic.

Uhm, actually, my machine name ("model") is specific, but the compatible
string is indeed generic, so this would mean adding an extra string
there. Not that it's a big issue, but it just seems too complicated and
hard to follow. After all, we wanted device tree in the first place to
get rid of board-specific files. To me, filtering by machine name looks
like a big step backwards, especially if it's all about a "pretty
standard feature" like disabling autoneg.
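
To make the "machine + PHY tuple" check concrete, here is a minimal
userspace sketch in plain C (a model, not kernel code). The struct, the
function name and the idea of passing the compatible list as an array are
all assumptions made for illustration; in the kernel the equivalent checks
would go through of_machine_is_compatible() and the PHY ID.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical quirk entry: one board "compatible" string plus a PHY ID match. */
struct board_phy_quirk {
    const char *compatible;  /* board-specific string expected in the DT */
    uint32_t phy_id;         /* PHY identifier (OUI + model) */
    uint32_t phy_id_mask;    /* bits of phy_id that must match */
};

/*
 * Return true only if the PHY ID matches under the mask AND one of the
 * board's compatible strings equals the quirk's string.
 */
bool quirk_matches(const struct board_phy_quirk *q,
                   const char *const *compat, size_t n_compat,
                   uint32_t phy_id)
{
    size_t i;

    if ((phy_id & q->phy_id_mask) != (q->phy_id & q->phy_id_mask))
        return false;

    for (i = 0; i < n_compat; i++)
        if (strcmp(compat[i], q->compatible) == 0)
            return true;

    return false;
}
```

With only a generic compatible string in the list, the check can never
succeed, which is why an extra board-specific string would have to be added
for this approach to work at all.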

>
>>
>>
>>> - clear the SUPPORTED_Autoneg bits from phydev->supported right after PHY
>>> registration and before the call to phy_start()
>>
>>
>> I actually tried clearing it by tweaking the patch on this thread, but the
>> end result is that it does not produce any effect (see further comments
>> below). The only thing that seems to play a role here is explicitly setting
>> phydev->autoneg = AUTONEG_DISABLE.
>>
>>
>>> - set the PHY_HAS_MAGICANEG bit in your PHY driver flag
>>
>>
>> Again, this seems to play no role whatsoever here:
>>
>>                          } else if (0 == phydev->link_timeout--) {
>>                                  needs_aneg = 1;
>>                                  /* If we have the magic_aneg bit,
>>                                   * we try again */
>>                                  if (phydev->drv->flags & PHY_HAS_MAGICANEG)
>>                                          break;
>>                          }
>>                          break;
>>                  case PHY_NOLINK:
>>
>> This code might have made sense when it was written in 2006 -- back then,
>> the break statement was skipping some fallback code. But now it seems to do
>> nothing.
>>
>>
>>>
>>>>
>>>> (instead of phydev->advertising):
>>>>    >> +               if (!of_property_read_u32(np, override->prop, &tmp)) {
>>>>    >> +                       if (tmp) {
>>>>    >> +                               *val |= override->value;
>>>>    >> +                               phydev->advertising |= override->supported;
>>>>    >> +                       } else {
>>>>    >> +                               phydev->advertising &= ~(override->supported);
>>>>    >> +                       }
>>>>    >> +
>>>>    >> +                       *mask |= override->value;
>>>>
>>>> What I find weird is that the only way phydev->autoneg could ever be set
>>>> to disabled is from here (phy.c):
>>>>
>>>> static void phy_sanitize_settings(struct phy_device *phydev)
>>>> {
>>>>          u32 features = phydev->supported;
>>>>          int idx;
>>>>
>>>>          /* Sanitize settings based on PHY capabilities */
>>>>          if ((features & SUPPORTED_Autoneg) == 0)
>>>>                  phydev->autoneg = AUTONEG_DISABLE;
>>>>
>>>> which is in turn only called when phydev->autoneg is set to
>>>> AUTONEG_DISABLE to begin with:
>>>>
>>>> int phy_start_aneg(struct phy_device *phydev)
>>>> {
>>>>          int err;
>>>>
>>>>          mutex_lock(&phydev->lock);
>>>>
>>>>          if (AUTONEG_DISABLE == phydev->autoneg)
>>>>                  phy_sanitize_settings(phydev);
>>>>
>>>> So could someone please help me figure out what I'm missing here?
>>>
>>>
>>> At first glance it looks like the PHY driver should be reading the
>>> phydev->autoneg value when the PHY driver config_aneg() callback is called
>>> to be allowed to set the forced speed and settings.
>>>
>>> The way phy_sanitize_settings() is coded does not make it return a mask of
>>> features, but only the forced supported speed and duplex. Then when the link
>>> is forced but we are having some issues getting a link status, libphy tries
>>> lower speeds and this function is used again to provide the next
>>> speed/duplex pair to try.
>>>
>>
>> What I was trying to say is that phy_sanitize_settings() is only called when
>> phydev->autoneg == AUTONEG_DISABLE, and in turn it's the only generic
>> function setting phydev->autoneg = AUTONEG_DISABLE.
>> So perhaps the condition should read:
>>
>> -       if (AUTONEG_DISABLE == phydev->autoneg)
>> +       if ((features & SUPPORTED_Autoneg) == 0)
>>                  phy_sanitize_settings(phydev);
>>
>> Or else, some other parts of the generic code should take care of setting it
>> to AUTONEG_DISABLE, depending on whether the feature is supported or not.
>> What I found weird is explicitly setting a value (phydev->autoneg =
>> AUTONEG_DISABLE), from a static function which is only called when that
>> condition is already true.
>
> I do not think that this change is correct either, let me cook a patch
> for you to allow disabling autoneg from the start.

Oh, OK, that would be great, thank you!
FWIW, I've already spent quite some time trying to overcome this -- my 
understanding is that you somehow need to set phydev->autoneg to 
AUTONEG_DISABLE at a very early stage (and that could of course be done 
as a consequence of SUPPORTED_Autoneg being unset), otherwise the whole 
software phy state machine and speed-matching algorithms will get confused.
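
The circular dependency discussed above can be reproduced with a small
userspace model in plain C. This is only a sketch: struct phy_model and
both function names are made up for illustration, and sync_autoneg_early()
is a hypothetical early hook, not an existing libphy function. The constant
values are meant to mirror the ethtool definitions.

```c
#include <stdint.h>

/* Model constants, mirroring the ethtool definitions. */
#define SUPPORTED_Autoneg (1u << 6)
#define AUTONEG_DISABLE   0x00
#define AUTONEG_ENABLE    0x01

/* Hypothetical stand-in for the two struct phy_device fields involved. */
struct phy_model {
    uint32_t supported;
    int autoneg;
};

/*
 * Models the quoted phy_start_aneg()/phy_sanitize_settings() pair: the
 * SUPPORTED_Autoneg check is only reached when autoneg is already
 * disabled, so it can never be what disables it.
 */
void start_aneg_current(struct phy_model *p)
{
    if (p->autoneg == AUTONEG_DISABLE) {
        if ((p->supported & SUPPORTED_Autoneg) == 0)
            p->autoneg = AUTONEG_DISABLE;
    }
}

/*
 * Hypothetical early hook: derive autoneg from the supported mask before
 * the state machine starts.
 */
void sync_autoneg_early(struct phy_model *p)
{
    if ((p->supported & SUPPORTED_Autoneg) == 0)
        p->autoneg = AUTONEG_DISABLE;
}
```

Starting from supported without SUPPORTED_Autoneg and autoneg set to
AUTONEG_ENABLE, start_aneg_current() leaves autoneg enabled, while
sync_autoneg_early() switches it to AUTONEG_DISABLE.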

>>
>> BTW, I feel like disabling autoneg from the start has never been a use case
>> before, am I right?
>
> Not really no, and that is because most hardware does not need quirks
> to work correctly.

To be honest with you, I don't have much experience with MII/PHY, but I've
already seen two completely unrelated cases where autoneg needs to be
disabled in order for the hardware to work correctly. Of course I'm only
talking about in-board connections (i.e. not PHYs connected to an RJ-45
jack), still...
In this particular hardware configuration, not only does autoneg need to
be disabled in the first place (otherwise the link won't work at all), but
the PHY HW is also buggy, so that even with autoneg disabled it may still
occasionally not work (about 0.1% of the time).

Thank you so much!
Gerlando