Message-ID: <unc6tesuw2b7wi3nqacikah2wxbefmnlufjn7x3gidqbo3u5gg@jopyg7nj5ckf>
Date: Thu, 17 Aug 2023 07:24:15 -0700
From: Davidlohr Bueso <dave@...olabs.net>
To: Chengfeng Ye <dg573847474@...il.com>
Cc: hare@...e.de, jejb@...ux.ibm.com, martin.petersen@...cle.com,
bigeasy@...utronix.de, satishkh@...co.com, sebaddel@...co.com,
kartilak@...co.com, linux-scsi@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH RESEND] scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock

On Wed, 16 Aug 2023, Chengfeng Ye wrote:
>There is a long call chain through which &fip->ctlr_lock is acquired
>by the ISR fnic_isr_msix_wq_copy() in hard irq context. Process
>context code that acquires the lock must therefore disable irqs;
>otherwise a deadlock can occur if the irq preempts execution while
>the lock is held in process context on the same CPU.
>
>[ISR]
>fnic_isr_msix_wq_copy()
> -> fnic_wq_copy_cmpl_handler()
> -> fnic_fcpio_cmpl_handler()
> -> fnic_fcpio_flogi_reg_cmpl_handler()
> -> fnic_flush_tx()
> -> fnic_send_frame()
> -> fcoe_ctlr_els_send()
> -> spin_lock_bh(&fip->ctlr_lock)
>
>[Process Context]
>1. fcoe_ctlr_timer_work()
> -> fcoe_ctlr_flogi_send()
> -> spin_lock_bh(&fip->ctlr_lock)
>
>2. fcoe_ctlr_recv_work()
> -> fcoe_ctlr_recv_handler()
> -> fcoe_ctlr_recv_els()
> -> fcoe_ctlr_announce()
> -> spin_lock_bh(&fip->ctlr_lock)
>
>3. fcoe_ctlr_recv_work()
> -> fcoe_ctlr_recv_handler()
> -> fcoe_ctlr_recv_els()
> -> fcoe_ctlr_flogi_retry()
> -> spin_lock_bh(&fip->ctlr_lock)
>
>4. -> fcoe_xmit()
> -> fcoe_ctlr_els_send()
> -> spin_lock_bh(&fip->ctlr_lock)
>
>spin_lock_bh() only disables softirqs, which is not enough since
>fnic_isr_msix_wq_copy() runs in hardirq context.
>
>These flaws were found by an experimental static analysis tool I am
>developing for irq-related deadlocks.
>
>The patch fixes the potential deadlocks by using spin_lock_irqsave()
>to disable hard irqs.
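
For anyone following along, the argument above can be illustrated with a
rough userspace toy model. This is only a sketch and not kernel code: the
struct cpu_model type and the model_lock() and hardirq_would_deadlock()
helpers are hypothetical names made up for illustration, and the model
assumes a single CPU and a single lock standing in for fip->ctlr_lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model. None of these names are kernel APIs. */
struct cpu_model {
	bool hardirq_enabled;	/* can a hard irq preempt this CPU now? */
	bool softirq_enabled;	/* can bottom halves run on this CPU now? */
	bool lock_held;		/* stands in for fip->ctlr_lock */
};

enum lock_kind { LOCK_BH, LOCK_IRQSAVE };

/*
 * Process context takes the lock, masking what spin_lock_bh() or
 * spin_lock_irqsave() would mask on the local CPU.
 */
static void model_lock(struct cpu_model *cpu, enum lock_kind kind)
{
	assert(!cpu->lock_held);		/* model a non-recursive lock */
	if (kind == LOCK_BH)
		cpu->softirq_enabled = false;	/* hard irqs stay enabled */
	else
		cpu->hardirq_enabled = false;	/* hard irqs masked locally */
	cpu->lock_held = true;
}

/*
 * A hard irq handler that needs the same lock can only spin forever if it
 * is able to preempt the lock holder on this CPU.
 */
static bool hardirq_would_deadlock(const struct cpu_model *cpu)
{
	return cpu->hardirq_enabled && cpu->lock_held;
}
```

Starting from a CPU with both flags set and the lock free, model_lock()
with LOCK_BH leaves hardirq_would_deadlock() true, while LOCK_IRQSAVE
makes it false, which is the difference the patch relies on.
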
Reviewed-by: Davidlohr Bueso <dave@...olabs.net>