Message-ID: <OFB11D8F46.84F3CAC0-ONC12577D7.005C8EE4-C12577D7.005D3148@transmode.se>
Date: Wed, 10 Nov 2010 17:57:54 +0100
From: Joakim Tjernlund <joakim.tjernlund@...nsmode.se>
To: unlisted-recipients:; (no To-header on input)
Cc: Anton Vorontsov <avorontsov@...mvista.com>, netdev@...r.kernel.org
Subject: Re: [PATCH] ucc_geth: Fix hung tasks.
Joakim Tjernlund/Transmode wrote on 2010/11/10 15:11:22:
>
> Actually, there is something wrong anyway with TX timeout
> so don't use this patch. I must investigate more but
> it seems like cancel_work_sync hangs whenever a TX timeout
> occurs.
OK, found the problem. Currently ucc_geth brings the interface down and up
each time a TX timeout occurs, which means you cannot call cancel_work_sync()
in ucc_geth_close(): the timeout work itself ends up in ucc_geth_close(), so
it would wait for its own completion and deadlock.
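To make the deadlock above concrete, here is a small userspace model of the
two recovery paths. It is only a sketch: every model_* name is hypothetical
and none of this is kernel API. The real cancel_work_sync() would simply hang
when called from inside the work item it is waiting for; the model returns
-EDEADLK instead so the case can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are NOT kernel structures. */
struct model_work {
	bool running;		/* true while the work handler executes */
};

struct model_dev {
	struct model_work timeout_work;
	bool up;		/* interface state */
	int mac_resets;		/* number of in-place MAC reinits */
};

/*
 * Models cancel_work_sync(): it has to wait until a running handler
 * finishes. If the caller *is* that handler, the wait never ends.
 * The model reports -EDEADLK where the real call would hang.
 */
static int model_cancel_work_sync(struct model_work *w)
{
	assert(w != NULL);
	return w->running ? -EDEADLK : 0;
}

/* Models ucc_geth_close() once cancel_work_sync() has been added. */
static int model_close(struct model_dev *d)
{
	int ret = model_cancel_work_sync(&d->timeout_work);

	if (ret)
		return ret;
	d->up = false;
	return 0;
}

/* Old timeout handler: recover by closing and reopening the device. */
static int model_timeout_reopen(struct model_dev *d)
{
	int ret;

	d->timeout_work.running = true;
	ret = model_close(d);	/* waits for the work we are running in */
	if (!ret)
		d->up = true;	/* the "open" half of the reopen */
	d->timeout_work.running = false;
	return ret;
}

/* Proposed handler: reinit the MAC in place, never enter the close path. */
static int model_timeout_reinit(struct model_dev *d)
{
	d->timeout_work.running = true;
	d->mac_resets++;	/* stands in for stop + init_mac + phy_start */
	d->timeout_work.running = false;
	return 0;
}
```

In this model, model_timeout_reopen() fails with -EDEADLK while
model_timeout_reinit() completes, and a later model_close() from outside the
work item still succeeds.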
Looking at gianfar, it just reinits the controller and PHY and
I guess ucc_geth really should do the same.
This patch tries to do that but I am not sure it recovers
after a TX timeout.
Anton, what do you think? If it is OK with you, I will write up
a proper patch.
diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 6647ed7..133aaba 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -2065,9 +2065,6 @@ static void ucc_geth_stop(struct ucc_geth_private *ugeth)
/* Disable Rx and Tx */
clrbits32(&ug_regs->maccfg1, MACCFG1_ENABLE_RX | MACCFG1_ENABLE_TX);
- phy_disconnect(ugeth->phydev);
- ugeth->phydev = NULL;
-
ucc_geth_memclean(ugeth);
}
@@ -3558,6 +3555,8 @@ static int ucc_geth_close(struct net_device *dev)
cancel_work_sync(&ugeth->timeout_work);
ucc_geth_stop(ugeth);
+ phy_disconnect(ugeth->phydev);
+ ugeth->phydev = NULL;
free_irq(ugeth->ug_info->uf_info.irq, ugeth->ndev);
@@ -3586,8 +3585,12 @@ static void ucc_geth_timeout_work(struct work_struct *work)
* Must reset MAC *and* PHY. This is done by reopening
* the device.
*/
- ucc_geth_close(dev);
- ucc_geth_open(dev);
+ netif_tx_stop_all_queues(dev);
+ ucc_geth_stop(ugeth);
+ ucc_geth_init_mac(ugeth);
+ /* Must start PHY here */
+ phy_start(ugeth->phydev);
+ netif_tx_start_all_queues(dev);
}
netif_tx_schedule_all(dev);
>
> Joakim Tjernlund/Transmode wrote on 2010/11/10 13:05:28:
> >
> > Ping?
> >
> > Even though this patch didn't solve my hang it is still a bug.
> >
> > Jocke
> >
> > Joakim Tjernlund <Joakim.Tjernlund@...nsmode.se> wrote on 2010/11/08 11:23:39:
> >
> > > From: Joakim Tjernlund <Joakim.Tjernlund@...nsmode.se>
> > > To: linuxppc-dev@...ts.ozlabs.org, netdev@...r.kernel.org, Anton Vorontsov <avorontsov@...mvista.com>
> > > Cc: Joakim Tjernlund <Joakim.Tjernlund@...nsmode.se>
> > > Date: 2010/11/08 11:23
> > > Subject: [PATCH] ucc_geth: Fix hung tasks.
> > >
> > > We noticed a few hangs like this:
> > >
> > > INFO: task ifconfig:572 blocked for more than 120 seconds.
> > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > ifconfig D 0ff65760 0 572 369 0x00000000
> > > Call Trace:
> > > [c6157be0] [c6008460] 0xc6008460 (unreliable)
> > > [c6157ca0] [c0008608] __switch_to+0x4c/0x6c
> > > [c6157cb0] [c028fecc] schedule+0x184/0x310
> > > [c6157ce0] [c0290e54] __mutex_lock_slowpath+0xa4/0x150
> > > [c6157d20] [c0290c48] mutex_lock+0x44/0x48
> > > [c6157d30] [c01aba74] phy_stop+0x20/0x70
> > > [c6157d40] [c01aef40] ucc_geth_stop+0x30/0x98
> > > [c6157d60] [c01b18fc] ucc_geth_close+0x9c/0xdc
> > > [c6157d80] [c01db0cc] __dev_close+0xa0/0xd0
> > > [c6157d90] [c01deddc] __dev_change_flags+0x8c/0x148
> > > [c6157db0] [c01def54] dev_change_flags+0x1c/0x64
> > > [c6157dd0] [c0237ac8] devinet_ioctl+0x678/0x784
> > > [c6157e50] [c0239a58] inet_ioctl+0xb0/0xbc
> > > [c6157e60] [c01cafa8] sock_ioctl+0x174/0x2a0
> > > [c6157e80] [c009a16c] vfs_ioctl+0xcc/0xe0
> > > [c6157ea0] [c009a998] do_vfs_ioctl+0xc4/0x79c
> > > [c6157f10] [c009b0b0] sys_ioctl+0x40/0x74
> > > [c6157f40] [c00117c4] ret_from_syscall+0x0/0x38
> > >
> > > I THINK this is due to a missing cancel_work_sync in the driver
> > > although we cannot be sure. I found this by comparing
> > > ucc_geth with gianfar.
> > >
> > > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@...nsmode.se>
> > > ---
> > > drivers/net/ucc_geth.c | 1 +
> > > 1 files changed, 1 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
> > > index 97f9f7d..6647ed7 100644
> > > --- a/drivers/net/ucc_geth.c
> > > +++ b/drivers/net/ucc_geth.c
> > > @@ -3556,6 +3556,7 @@ static int ucc_geth_close(struct net_device *dev)
> > >
> > > napi_disable(&ugeth->napi);
> > >
> > > + cancel_work_sync(&ugeth->timeout_work);
> > > ucc_geth_stop(ugeth);
> > >
> > > free_irq(ugeth->ug_info->uf_info.irq, ugeth->ndev);
> > > --
> > > 1.7.2.2
> > >