Message-ID: <931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>
Date: Thu, 31 Aug 2017 09:57:58 -0700
From: Florian Fainelli <f.fainelli@...il.com>
To: David Daney <ddaney.cavm@...il.com>,
Marc Gonzalez <marc_gonzalez@...madesigns.com>
Cc: netdev <netdev@...r.kernel.org>,
Geert Uytterhoeven <geert+renesas@...der.be>,
David Miller <davem@...emloft.net>,
Andrew Lunn <andrew@...n.ch>, Mans Rullgard <mans@...sr.com>,
Mason <slash.tmp@...e.fr>
Subject: Re: [PATCH net] Revert "net: phy: Correctly process PHY_HALTED in
phy_stop_machine()"
On 08/31/2017 09:36 AM, David Daney wrote:
> On 08/31/2017 05:29 AM, Marc Gonzalez wrote:
>> On 31/08/2017 02:49, Florian Fainelli wrote:
>>
>>> This reverts commit 7ad813f208533cebfcc32d3d7474dc1677d1b09a ("net: phy:
>>> Correctly process PHY_HALTED in phy_stop_machine()") because it is
>>> creating the possibility for a NULL pointer dereference.
>>>
>>> David Daney provided the following call trace and diagram of events:
>>>
>>> When ndo_stop() is called we call:
>>>
>>> phy_disconnect()
>>> +---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL;
>>
>> What does this mean?
>
> I meant that after the call to phy_stop_interrupts(), phydev->irq =
> PHY_POLL;
>
>
>>
>> On the contrary, phy_stop_interrupts() is only called when *not* polling.
>
> That is the case I have. We are using interrupts from the phy.
>
>
>>
>>     if (phydev->irq > 0)
>>         phy_stop_interrupts(phydev);
>>
>>> +---> phy_stop_machine()
>>> | +---> phy_state_machine()
>>> | +----> queue_delayed_work(): Work queued.
>>
>> You're referring to the fact that, at the end of phy_state_machine()
>> (in polling mode) the code reschedules itself through:
>>
>>     if (phydev->irq == PHY_POLL)
>>         queue_delayed_work(system_power_efficient_wq,
>>                            &phydev->state_queue, PHY_STATE_TIME * HZ);
>
> Exactly. The call to phy_disconnect() ensures that there are no more
> interrupts and also that phydev->irq = PHY_POLL
>
> The call to cancel_delayed_work_sync() at the top of phy_stop_machine()
> was meant to ensure that phy_state_machine() was never run again. No
> interrupts + no queued work means that it should be safe to do...
>
>>
>>> +--->phy_detach() implies: phydev->attached_dev = NULL;
>
> The problem is that by calling phy_state_machine() again (which the
> offending patch added) we now have work scheduled that will try to
> dereference the pointer that was set to NULL as a result of the
> phy_detach()
And the race is between phy_detach() setting phydev->attached_dev = NULL
and phy_state_machine() running in PHY_HALTED state and calling
netif_carrier_off().
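
To make the sequence concrete, here is a rough single-threaded model in
userspace C. It is only a sketch of the ordering described above, not kernel
code: struct phy_model and the model_*() helpers are made-up names, the work
queue is reduced to a flag, and the PHY_HALTED case is simplified to "touches
attached_dev" (the real code only calls netif_carrier_off() under additional
conditions).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PHY_POLL (-1)	/* stand-in for the kernel's PHY_POLL */

/* Hypothetical, heavily reduced stand-in for struct phy_device. */
struct phy_model {
	int irq;
	bool halted;			/* true once the PHY is in PHY_HALTED */
	bool work_queued;		/* models the delayed work item */
	const char *attached_dev;	/* NULL once detached */
	bool null_deref;		/* set where the real code would crash */
};

/* One pass of the modelled state machine. */
void model_state_machine(struct phy_model *p)
{
	assert(p != NULL);
	p->work_queued = false;
	if (p->halted && p->attached_dev == NULL)
		p->null_deref = true;	/* netif_carrier_off(NULL) in the real code */
	if (p->irq == MODEL_PHY_POLL)
		p->work_queued = true;	/* re-queue, as in polling mode */
}

/* Models phy_disconnect(); the flag selects the behaviour being reverted. */
void model_disconnect(struct phy_model *p, bool rerun_state_machine)
{
	assert(p != NULL);
	p->irq = MODEL_PHY_POLL;	/* effect of phy_stop_interrupts() */
	p->work_queued = false;		/* cancel_delayed_work_sync() */
	p->halted = true;
	if (rerun_state_machine)
		model_state_machine(p);	/* call added by the reverted commit */
	p->attached_dev = NULL;		/* phy_detach() */
}

/* Runs pending work, if any; returns true if it would dereference NULL. */
bool model_run_pending(struct phy_model *p)
{
	assert(p != NULL);
	if (p->work_queued)
		model_state_machine(p);
	return p->null_deref;
}
```

With rerun_state_machine set, the model leaves work queued across the detach
and model_run_pending() reports the NULL dereference; with it cleared, nothing
is queued and nothing runs afterwards. The real race depends on timing, which
this model deliberately ignores.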
>
>
>>>
>>> Now at a later time the queued work does:
>>>
>>> phy_state_machine()
>>> +---->netif_carrier_off(phydev->attached_dev): Oh no! It is NULL:
>>
>> I tested a sequence of 500 link up / link down in polling mode,
>> and saw no such issue. Race condition?
>>
>
> You were lucky.
I too tested this a number of times on 2-core and 4-core systems, but
the race is there; both of us were just lucky enough not to see a
crash. I suspect the race is easier to reproduce on a system with more
cores (at least 12) and possibly a higher clock speed.
>
>> For what case in phy_state_machine() is netif_carrier_off()
>> being called? Surely not PHY_HALTED?
>>
>
> The phy can be in a variety of states. It is connected to something
> outside of the system that we don't control, so you cannot assume any
> particular state. We must have code that doesn't crash the system no
> matter what state the phy is in.
>
> I suspect, but have not checked, that the phy is in PHY_RUNNING. I
> think that means that because this patch turned the state machine back
> on, it will start transitioning through PHY_UP, PHY_AN, ... and
> eventually get to the crash we see because phydev->attached_dev = NULL
I actually think the PHY remains in PHY_HALTED: it just keeps
re-scheduling itself and stays in PHY_HALTED until a call to
phy_resume() or phy_start() moves it to another state. This is
inefficient, and we should look into using the patch I posted yesterday,
which prevents the re-schedule once the PHY is in PHY_HALTED:
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index d0626bf5c540..78168e19bd5d 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1234,7 +1234,7 @@ void phy_state_machine(struct work_struct *work)
 	 * PHY, if PHY_IGNORE_INTERRUPT is set, then we will be moving
 	 * between states from phy_mac_interrupt()
 	 */
-	if (phydev->irq == PHY_POLL)
+	if (phydev->irq == PHY_POLL && phydev->state != PHY_HALTED)
 		queue_delayed_work(system_power_efficient_wq,
 				   &phydev->state_queue,
 				   PHY_STATE_TIME * HZ);
 }
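
For reference, the effect of the changed condition can be sketched as a pair
of predicates in plain userspace C. This is only an illustration: the enum and
function names are made up, and MODEL_PHY_POLL is a stand-in for the kernel's
PHY_POLL.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PHY_POLL (-1)	/* stand-in for the kernel's PHY_POLL */

/* Hypothetical two-state reduction of the PHY state enum. */
enum model_phy_state { MODEL_PHY_RUNNING, MODEL_PHY_HALTED };

/* Condition before the patch: re-queue whenever the PHY is polled. */
bool requeue_before(int irq, enum model_phy_state state)
{
	(void)state;	/* the old check ignores the state */
	return irq == MODEL_PHY_POLL;
}

/* Condition after the patch: a halted PHY no longer re-queues itself. */
bool requeue_after(int irq, enum model_phy_state state)
{
	return irq == MODEL_PHY_POLL && state != MODEL_PHY_HALTED;
}

/* Small self-check covering the cases that matter here. */
void check_requeue_model(void)
{
	assert(requeue_before(MODEL_PHY_POLL, MODEL_PHY_HALTED));
	assert(!requeue_after(MODEL_PHY_POLL, MODEL_PHY_HALTED));
	assert(requeue_after(MODEL_PHY_POLL, MODEL_PHY_RUNNING));
	assert(!requeue_after(5, MODEL_PHY_RUNNING));
}
```

The only case whose outcome changes is a polled PHY in PHY_HALTED: it used to
re-queue and no longer does. Interrupt-driven PHYs never re-queued from here
in either version.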
--
Florian