[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <934557d0-0c5a-76d5-e8f6-a8683210b0ed@amd.com>
Date: Tue, 15 Oct 2019 15:26:07 +0000
From: "Lendacky, Thomas" <Thomas.Lendacky@....com>
To: James Morse <james.morse@....com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC: "Dave S . Miller" <davem@...emloft.net>
Subject: Re: [RFC PATCH net 1/2] amd-xgbe: Avoid sleeping in flush_workqueue()
while holding a spinlock
On 10/15/19 8:49 AM, James Morse wrote:
> xgbe_powerdown() takes an irqsave spinlock, then calls flush_workqueue()
> which takes a mutex. DEBUG_ATOMIC_SLEEP isn't happy about this:
> | BUG: sleeping function called from invalid context at [...] mutex.c:281
> | in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 2733, name: bash
> | CPU: 4 PID: 2733 Comm: bash Tainted: G W 5.4.0-rc3 #113
> | Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) (DT)
> | Call trace:
> | show_stack+0x24/0x30
> | dump_stack+0xb0/0xf8
> | ___might_sleep+0x124/0x148
> | __might_sleep+0x54/0x90
> | mutex_lock+0x2c/0x80
> | flush_workqueue+0x84/0x420
> | xgbe_powerdown+0x6c/0x108
> | xgbe_platform_suspend+0x34/0x80
> | pm_generic_freeze+0x3c/0x58
> | acpi_subsys_freeze+0x2c/0x38
> | dpm_run_callback+0x3c/0x1e8
> | __device_suspend+0x130/0x468
> | dpm_suspend+0x114/0x388
> | hibernation_snapshot+0xe8/0x378
> | hibernate+0x18c/0x2f8
>
> Drop the lock for the duration of xgbe_powerdown(). We have already
> stopeed the timers, so service_work can't be re-queued. Move the
> pdata->power_down flag earlier so that it can be used by the interrupt
> handler to know not to re-queue the tx_tstamp_work.
>
> Signed-off-by: James Morse <james.morse@....com>
>
> ---
> RFC as I'm not familiar with this driver. I'm happy to test a better fix!
> ---
> drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 22 ++++++++++++++++++----
> 1 file changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> index 98f8f2033154..bfba7effcf9f 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> @@ -480,6 +480,8 @@ static void xgbe_isr_task(unsigned long data)
> struct xgbe_channel *channel;
> unsigned int dma_isr, dma_ch_isr;
> unsigned int mac_isr, mac_tssr, mac_mdioisr;
> + unsigned long flags;
> + bool power_down;
> unsigned int i;
>
> /* The DMA interrupt status register also reports MAC and MTL
> @@ -558,8 +560,14 @@ static void xgbe_isr_task(unsigned long data)
> /* Read Tx Timestamp to clear interrupt */
> pdata->tx_tstamp =
> hw_if->get_tx_tstamp(pdata);
> - queue_work(pdata->dev_workqueue,
> - &pdata->tx_tstamp_work);
> +
> + spin_lock_irqsave(&pdata->lock, flags);
> + power_down = pdata->power_down;
> + spin_unlock_irqrestore(&pdata->lock, flags);
> +
> + if (!power_down)
> + queue_work(pdata->dev_workqueue,
> + &pdata->tx_tstamp_work);
Hmm, I think this could race. You probably want to protect queue_work()
with the spin lock, too, in which case you won't need the power_down var.
> }
> }
>
> @@ -1256,16 +1264,22 @@ int xgbe_powerdown(struct net_device *netdev, unsigned int caller)
>
> netif_tx_stop_all_queues(netdev);
>
> + /* Stop service_work being re-queued by the service_timer */
> xgbe_stop_timers(pdata);
> +
> + /* Stop tx_tstamp_work being re-queued after flush_workqueue() */
> + pdata->power_down = 1;
You can probably move this up even further, just after grabbing the lock.
> +
> + /* drop the lock to allow flush_workqueue() to sleep. */
> + spin_unlock_irqrestore(&pdata->lock, flags);
> flush_workqueue(pdata->dev_workqueue);
Rather than dropping and reacquiring the lock, I think you can get away
with moving this to after dropping the lock below.
Thanks,
Tom
> + spin_lock_irqsave(&pdata->lock, flags);
>
> hw_if->powerdown_tx(pdata);
> hw_if->powerdown_rx(pdata);
>
> xgbe_napi_disable(pdata, 0);
>
> - pdata->power_down = 1;
> -
> spin_unlock_irqrestore(&pdata->lock, flags);
>
> DBGPR("<--xgbe_powerdown\n");
>
Powered by blists - more mailing lists