[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20090804005256.8e03f1ba.akpm@linux-foundation.org>
Date: Tue, 4 Aug 2009 00:52:56 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: <jie.yang@...eros.com>
Cc: <davem@...emloft.net>, <netdev@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH]atl1c:Do not call cancel_work_sync from the work itself
On Tue, 4 Aug 2009 10:19:06 +0800 <jie.yang@...eros.com> wrote:
> Do not call cancel_work_sync from the work itself, for it my cause
> recursive locking.
> detail info:
> events/1/10 is trying to acquire lock:
> (&adapter->reset_task){+.+...}, at: [<c043e384>] __cancel_work_timer+0x80/0x187
>
> but task is already holding lock:
> (&adapter->reset_task){+.+...}, at: [<c043ed6f>] worker_thread+0x127/0x234
> other info that might help us debug this:
>
> 2 locks held by events/1/10:
> #0: (events){+.+.+.}, at: [<c043ed6f>] worker_thread+0x127/0x234
> #1: (&adapter->reset_task){+.+...}, at: [<c043ed6f>] worker_thread+0x127/0x234
>
> stack backtrace:
> Pid: 10, comm: events/1 Not tainted 2.6.31-rc2 #12
> Call Trace:
> [<c04519c9>] validate_chain+0x4ae/0xb26
> [<c0451e8d>] ? validate_chain+0x972/0xb26
> [<c0437f4b>] ? lock_timer_base+0x1f/0x3e
> [<c04526f8>] __lock_acquire+0x6b7/0x745
> [<c0452816>] lock_acquire+0x90/0xad
> [<c043e384>] ? __cancel_work_timer+0x80/0x187
> [<c043e3b1>] __cancel_work_timer+0xad/0x187
> [<c043e384>] ? __cancel_work_timer+0x80/0x187
> [<c044f6b3>] ? mark_held_locks+0x3d/0x58
> [<c0690855>] ? _spin_unlock_irqrestore+0x36/0x3c
> [<c044f7d5>] ? trace_hardirqs_on_caller+0x107/0x12f
> [<c044f808>] ? trace_hardirqs_on+0xb/0xd
> [<c0437fb2>] ? try_to_del_timer_sync+0x48/0x4f
> [<c043e4a2>] cancel_work_sync+0xa/0xc
> [<f8955f95>] atl1c_down+0x1f/0xde [atl1c]
> [<f8956955>] atl1c_reset_task+0x1f/0x31 [atl1c]
> [<c043edae>] worker_thread+0x166/0x234
> [<c043ed6f>] ? worker_thread+0x127/0x234
> [<f8956936>] ? atl1c_reset_task+0x0/0x31 [atl1c]
> [<c0441ac5>] ? autoremove_wake_function+0x0/0x33
> [<c043ec48>] ? worker_thread+0x0/0x234
> [<c0441a25>] kthread+0x69/0x70
> [<c04419bc>] ? kthread+0x0/0x70
> [<c04034b7>] kernel_thread_helper+0x7/0x10
>
> So when atl1c_reset_task be scheduled just set a flag in ctrl_flag,
> to demonstrate it is in reset_task, when atl1c_down is call it will not call
> cancel_work_sync(&adapter->reset_task) if it sees the flag.
>
> Signed-off-by: jie yang <jie.yang@...eros.com>
> ---
> diff --git a/drivers/net/atl1c/atl1c.h b/drivers/net/atl1c/atl1c.h
> index 2a1120a..53242dc 100644
> --- a/drivers/net/atl1c/atl1c.h
> +++ b/drivers/net/atl1c/atl1c.h
> @@ -427,6 +427,7 @@ struct atl1c_hw {
> #define ATL1C_ASPM_CTRL_MON 0x0200
> #define ATL1C_HIB_DISABLE 0x0400
> #define ATL1C_LINK_CAP_1000M 0x0800
> +#define ATL1C_RESET_IN_WORK 0x1000
> #define ATL1C_FPGA_VERSION 0x8000
> u16 cmb_tpd;
> u16 cmb_rrd;
> diff --git a/drivers/net/atl1c/atl1c_main.c b/drivers/net/atl1c/atl1c_main.c
> index 1d601ce..dec88fa 100644
> --- a/drivers/net/atl1c/atl1c_main.c
> +++ b/drivers/net/atl1c/atl1c_main.c
> @@ -321,7 +321,10 @@ static void atl1c_del_timer(struct atl1c_adapter *adapter)
>
> static void atl1c_cancel_work(struct atl1c_adapter *adapter)
> {
> - cancel_work_sync(&adapter->reset_task);
> + if (adapter->hw.ctrl_flags & ATL1C_RESET_IN_WORK)
> + adapter->hw.ctrl_flags &= ~ATL1C_RESET_IN_WORK;/* clear the flag */
> + else
> + cancel_work_sync(&adapter->reset_task);
> cancel_work_sync(&adapter->link_chg_task);
> }
>
> @@ -1544,6 +1547,7 @@ static irqreturn_t atl1c_intr(int irq, void *data)
> /* reset MAC */
> hw->intr_mask &= ~ISR_ERROR;
> AT_WRITE_REG(hw, REG_IMR, hw->intr_mask);
> + adapter->hw.ctrl_flags |= ATL1C_RESET_IN_WORK;
> schedule_work(&adapter->reset_task);
> break;
> }
The use of non-atomic bit operations upon ctrl_flags looks unsafe. If
a CPU is running atl1c_cancel_work() and then an interrupt happens (on
this CPU or a different one), then the interrupt will set
ATL1C_RESET_IN_WORK. But the non-atomic operation in
atl1c_cancel_work() can just overwrite that change.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists