lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <934557d0-0c5a-76d5-e8f6-a8683210b0ed@amd.com>
Date:   Tue, 15 Oct 2019 15:26:07 +0000
From:   "Lendacky, Thomas" <Thomas.Lendacky@....com>
To:     James Morse <james.morse@....com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC:     "Dave S . Miller" <davem@...emloft.net>
Subject: Re: [RFC PATCH net 1/2] amd-xgbe: Avoid sleeping in flush_workqueue()
 while holding a spinlock

On 10/15/19 8:49 AM, James Morse wrote:
> xgbe_powerdown() takes an irqsave spinlock, then calls flush_workqueue()
> which takes a mutex. DEBUG_ATOMIC_SLEEP isn't happy about this:
> | BUG: sleeping function called from invalid context at [...] mutex.c:281
> | in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 2733, name: bash
> | CPU: 4 PID: 2733 Comm: bash Tainted: G        W         5.4.0-rc3 #113
> | Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) (DT)
> | Call trace:
> |  show_stack+0x24/0x30
> |  dump_stack+0xb0/0xf8
> |  ___might_sleep+0x124/0x148
> |  __might_sleep+0x54/0x90
> |  mutex_lock+0x2c/0x80
> |  flush_workqueue+0x84/0x420
> |  xgbe_powerdown+0x6c/0x108
> |  xgbe_platform_suspend+0x34/0x80
> |  pm_generic_freeze+0x3c/0x58
> |  acpi_subsys_freeze+0x2c/0x38
> |  dpm_run_callback+0x3c/0x1e8
> |  __device_suspend+0x130/0x468
> |  dpm_suspend+0x114/0x388
> |  hibernation_snapshot+0xe8/0x378
> |  hibernate+0x18c/0x2f8
> 
> Drop the lock for the duration of xgbe_powerdown(). We have already
> stopeed the timers, so service_work can't be re-queued. Move the
> pdata->power_down flag earlier so that it can be used by the interrupt
> handler to know not to re-queue the tx_tstamp_work.
> 
> Signed-off-by: James Morse <james.morse@....com>
> 
> ---
> RFC as I'm not familiar with this driver. I'm happy to test a better fix!
> ---
>  drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 22 ++++++++++++++++++----
>  1 file changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> index 98f8f2033154..bfba7effcf9f 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> @@ -480,6 +480,8 @@ static void xgbe_isr_task(unsigned long data)
>  	struct xgbe_channel *channel;
>  	unsigned int dma_isr, dma_ch_isr;
>  	unsigned int mac_isr, mac_tssr, mac_mdioisr;
> +	unsigned long flags;
> +	bool power_down;
>  	unsigned int i;
>  
>  	/* The DMA interrupt status register also reports MAC and MTL
> @@ -558,8 +560,14 @@ static void xgbe_isr_task(unsigned long data)
>  				/* Read Tx Timestamp to clear interrupt */
>  				pdata->tx_tstamp =
>  					hw_if->get_tx_tstamp(pdata);
> -				queue_work(pdata->dev_workqueue,
> -					   &pdata->tx_tstamp_work);
> +
> +				spin_lock_irqsave(&pdata->lock, flags);
> +				power_down = pdata->power_down;
> +				spin_unlock_irqrestore(&pdata->lock, flags);
> +
> +				if (!power_down)
> +					queue_work(pdata->dev_workqueue,
> +						   &pdata->tx_tstamp_work);

Hmm, I think this could race.  You probably want to protect queue_work()
with the spin lock, too, in which case you won't need the power_down var.

>  			}
>  		}
>  
> @@ -1256,16 +1264,22 @@ int xgbe_powerdown(struct net_device *netdev, unsigned int caller)
>  
>  	netif_tx_stop_all_queues(netdev);
>  
> +	/* Stop service_work being re-queued by the service_timer */
>  	xgbe_stop_timers(pdata);
> +
> +	/* Stop tx_tstamp_work being re-queued after flush_workqueue() */
> +	pdata->power_down = 1;

You can probably move this up even further, just after grabbing the lock.

> +
> +	/* drop the lock to allow flush_workqueue() to sleep. */
> +	spin_unlock_irqrestore(&pdata->lock, flags);
>  	flush_workqueue(pdata->dev_workqueue);

Rather than dropping and reacquiring the lock, I think you can get away
with moving this to after dropping the lock below.

Thanks,
Tom

> +	spin_lock_irqsave(&pdata->lock, flags);
>  
>  	hw_if->powerdown_tx(pdata);
>  	hw_if->powerdown_rx(pdata);
>  
>  	xgbe_napi_disable(pdata, 0);
>  
> -	pdata->power_down = 1;
> -
>  	spin_unlock_irqrestore(&pdata->lock, flags);
>  
>  	DBGPR("<--xgbe_powerdown\n");
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ