Message-ID: <CAFHy5LAQyL2JW1Lox67OSz2WuRnzhVgSk6-0hfHf=gG2fXYmRQ@mail.gmail.com>
Date:   Mon, 8 Jul 2019 23:16:02 -0400
From:   kwangdo yi <kwangdo.yi@...il.com>
To:     Florian Fainelli <f.fainelli@...il.com>
Cc:     netdev@...r.kernel.org, Andrew Lunn <andrew@...n.ch>,
        Heiner Kallweit <hkallweit1@...il.com>
Subject: Re: [PATCH] phy: added a PHY_BUSY state into phy_state_machine

I simply fixed this issue by increasing the polling time from 20 msec to
60 msec in the Xilinx EMAC driver. But the state machine would be in
better shape if it could tolerate a subsystem driver's spurious failure.
A PHY device driver could advertise the min/max timeouts for its
subsystem, but some vendors' EMAC drivers would still miss the deadline
if that value is not set properly in the PHY driver.
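
To make that concrete, below is a minimal user-space sketch of the
advertised-timeout idea. It is not the actual kernel API: struct
phy_drv_hints, the mdio_timeout_min_ms/mdio_timeout_max_ms fields, the
mdio_effective_timeout() helper, and the 200 msec ceiling are all
hypothetical, used only to illustrate how the MDIO bus layer could
clamp its default deadline to whatever range the PHY driver advertises.

#include <stdio.h>

struct phy_drv_hints {
	const char *name;
	unsigned int mdio_timeout_min_ms;	/* shortest deadline the PHY needs */
	unsigned int mdio_timeout_max_ms;	/* longest deadline worth waiting */
};

/* Bus-layer default, analogous to the 20 msec the Xilinx EMAC uses today. */
#define MDIO_BUS_DEFAULT_TIMEOUT_MS 20U

/* Clamp the bus default to the range the PHY driver advertises (if any). */
static unsigned int mdio_effective_timeout(const struct phy_drv_hints *drv)
{
	unsigned int t = MDIO_BUS_DEFAULT_TIMEOUT_MS;

	if (drv->mdio_timeout_min_ms && t < drv->mdio_timeout_min_ms)
		t = drv->mdio_timeout_min_ms;
	if (drv->mdio_timeout_max_ms && t > drv->mdio_timeout_max_ms)
		t = drv->mdio_timeout_max_ms;
	return t;
}

int main(void)
{
	/* A 60 msec floor matches the value that fixed the Zynq boards. */
	struct phy_drv_hints marvell_88e1512 = {
		.name = "Marvell 88E1512",
		.mdio_timeout_min_ms = 60,
		.mdio_timeout_max_ms = 200,
	};

	printf("%s: effective poll timeout %u msec\n", marvell_88e1512.name,
	       mdio_effective_timeout(&marvell_88e1512));
	return 0;
}

With a 60 msec floor advertised for the 88E1512, the bus layer would
never poll with the 20 msec default that proved too tight in our
stress tests.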

On Sun, Jul 7, 2019 at 11:07 PM Florian Fainelli <f.fainelli@...il.com> wrote:
>
> +Andrew, Heiner (please CC PHY library maintainers).
>
> On 7/7/2019 3:32 PM, kwangdo.yi wrote:
> > When the MDIO driver polls the PHY state in phy_state_machine, the
> > poll sometimes returns -ETIMEDOUT and the link is taken down, even
> > though the PHY is still alive and merely missed the polling
> > deadline. Closing the PHY link in this case seems too radical.
> > Missing the deadline happens very rarely: it shows up only when a
> > stress test runs for tens of hours on multiple target boards
> > (Xilinx Zynq7000 with a Marvell 88E1512 PHY and a Xilinx custom
> > EMAC IP). This patch gives phy_state_machine another chance when a
> > polling timeout happens. Only two consecutive missed deadlines are
> > treated as a real PHY halt and close the connection.
>
> How about simply increasing the MDIO polling timeout in the Xilinx EMAC
> driver instead? Or if the PHY is where the timeout needs to be
> increased, allow the PHY device drivers to advertise min/max timeouts
> such that the MDIO bus layer can use that information?
>
> >
> >
> > Signed-off-by: kwangdo.yi <kwangdo.yi@...il.com>
> > ---
> >  drivers/net/phy/phy.c | 5 +++++
> >  include/linux/phy.h   | 1 +
> >  2 files changed, 6 insertions(+)
> >
> > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> > index e888542..9e8138b 100644
> > --- a/drivers/net/phy/phy.c
> > +++ b/drivers/net/phy/phy.c
> > @@ -919,7 +919,12 @@ void phy_state_machine(struct work_struct *work)
> >               break;
> >       case PHY_NOLINK:
> >       case PHY_RUNNING:
> > +     case PHY_BUSY:
> >               err = phy_check_link_status(phydev);
> > +             if (err == -ETIMEDOUT && old_state == PHY_RUNNING) {
> > +                     phydev->state = PHY_BUSY;
> > +                     err = 0;
> > +             }
> >               break;
> >       case PHY_FORCING:
> >               err = genphy_update_link(phydev);
> > diff --git a/include/linux/phy.h b/include/linux/phy.h
> > index 6424586..4a49401 100644
> > --- a/include/linux/phy.h
> > +++ b/include/linux/phy.h
> > @@ -313,6 +313,7 @@ enum phy_state {
> >       PHY_RUNNING,
> >       PHY_NOLINK,
> >       PHY_FORCING,
> > +     PHY_BUSY,
> >  };
> >
> >  /**
> >
>
> --
> Florian
