Message-ID: <07406ab9-fbb0-4f98-56b1-0c64b7e695e1@broadcom.com>
Date: Wed, 19 Jul 2023 10:31:22 -0700
From: Ray Jui <ray.jui@...adcom.com>
To: Scott Branden <scott.branden@...adcom.com>,
Chengfeng Ye <dg573847474@...il.com>,
bcm-kernel-feedback-list@...adcom.com, arnd@...db.de,
gregkh@...uxfoundation.org
Cc: linux-kernel@...r.kernel.org, desmond.yan@...adcom.com
Subject: Re: [PATCH v3] misc: bcm_vk: Fix potential deadlock on &vk->ctx_lock
On 7/19/2023 10:07 AM, Scott Branden wrote:
> Works fine - thanks.
So apparently the previous choice of a timer was not made for
performance reasons?
If performance becomes a concern with the workqueue conversion (the
handler now runs in process/thread context rather than softirq), I
assume you are aware of another easy way to fix this potential
deadlock issue? :)
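For reference, the alternative I had in mind is to keep the timer and
instead disable bottom halves around the process-context lock sites, so
the softirq timer callback cannot preempt a holder on the same CPU.
Untested sketch only (the bcm_vk_get_ctx() site shown as an example;
every process-context taker of &vk->ctx_lock would need the same
treatment):

```diff
 	/* e.g. in bcm_vk_get_ctx(), process context */
-	spin_lock(&vk->ctx_lock);
+	spin_lock_bh(&vk->ctx_lock);
 	...
-	spin_unlock(&vk->ctx_lock);
+	spin_unlock_bh(&vk->ctx_lock);
```

That said, the workqueue conversion below is fine too, and avoids
holding the lock with bottom halves disabled.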
>
> On 2023-06-29 11:29, Chengfeng Ye wrote:
>> As &vk->ctx_lock is acquired by the timer bcm_vk_hb_poll() in softirq
>> context, process-context code should disable irqs or bottom halves
>> before acquiring the same lock; otherwise a deadlock can occur if the
>> timer preempts execution while the lock is held in process context
>> on the same CPU.
>>
>> Possible deadlock scenario
>> bcm_vk_open()
>> -> bcm_vk_get_ctx()
>> -> spin_lock(&vk->ctx_lock)
>> <timer interrupt>
>> -> bcm_vk_hb_poll()
>> -> bcm_vk_blk_drv_access()
>> -> spin_lock_irqsave(&vk->ctx_lock, flags) (deadlock here)
>>
>> This flaw was found using an experimental static analysis tool we are
>> developing for irq-related deadlocks, which reported the following
>> warnings when analyzing the Linux kernel 6.4-rc7 release.
>>
>> [Deadlock]: &vk->ctx_lock
>> [Interrupt]: bcm_vk_hb_poll
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_msg.c:176
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:512
>> [Locking Unit]: bcm_vk_ioctl
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:1181
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:512
>>
>> [Deadlock]: &vk->ctx_lock
>> [Interrupt]: bcm_vk_hb_poll
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_msg.c:176
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:512
>> [Locking Unit]: bcm_vk_ioctl
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:1169
>>
>> [Deadlock]: &vk->ctx_lock
>> [Interrupt]: bcm_vk_hb_poll
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_msg.c:176
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:512
>> [Locking Unit]: bcm_vk_open
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_msg.c:216
>>
>> [Deadlock]: &vk->ctx_lock
>> [Interrupt]: bcm_vk_hb_poll
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_msg.c:176
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:512
>> [Locking Unit]: bcm_vk_release
>> -->/root/linux/drivers/misc/bcm-vk/bcm_vk_msg.c:306
>>
>> As suggested by Arnd, this tentative patch fixes the potential
>> deadlocks by replacing the timer with a delayed workqueue. An x86_64
>> allyesconfig build with GCC shows no new warnings. Note that no
>> runtime testing was performed as no device was on hand.
>>
>> Signed-off-by: Chengfeng Ye <dg573847474@...il.com>
> Acked-by: Scott Branden <scott.branden@...adcom.com>
> Tested-by: Desmond Yan <desmond.branden@...adcom.com>
>
>> ---
>> drivers/misc/bcm-vk/bcm_vk.h | 2 +-
>> drivers/misc/bcm-vk/bcm_vk_msg.c | 14 +++++++-------
>> 2 files changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/misc/bcm-vk/bcm_vk.h b/drivers/misc/bcm-vk/bcm_vk.h
>> index 25d51222eedf..386884c2a263 100644
>> --- a/drivers/misc/bcm-vk/bcm_vk.h
>> +++ b/drivers/misc/bcm-vk/bcm_vk.h
>> @@ -340,7 +340,7 @@ struct bcm_vk_proc_mon_info {
>> };
>> struct bcm_vk_hb_ctrl {
>> - struct timer_list timer;
>> + struct delayed_work work;
>> u32 last_uptime;
>> u32 lost_cnt;
>> };
>> diff --git a/drivers/misc/bcm-vk/bcm_vk_msg.c b/drivers/misc/bcm-vk/bcm_vk_msg.c
>> index 3c081504f38c..e17d81231ea6 100644
>> --- a/drivers/misc/bcm-vk/bcm_vk_msg.c
>> +++ b/drivers/misc/bcm-vk/bcm_vk_msg.c
>> @@ -137,11 +137,11 @@ void bcm_vk_set_host_alert(struct bcm_vk *vk, u32 bit_mask)
>> #define BCM_VK_HB_TIMER_VALUE (BCM_VK_HB_TIMER_S * HZ)
>> #define BCM_VK_HB_LOST_MAX (27 / BCM_VK_HB_TIMER_S)
>> -static void bcm_vk_hb_poll(struct timer_list *t)
>> +static void bcm_vk_hb_poll(struct work_struct *work)
>> {
>> u32 uptime_s;
>> - struct bcm_vk_hb_ctrl *hb = container_of(t, struct bcm_vk_hb_ctrl,
>> - timer);
>> + struct bcm_vk_hb_ctrl *hb = container_of(to_delayed_work(work),
>> + struct bcm_vk_hb_ctrl, work);
>> struct bcm_vk *vk = container_of(hb, struct bcm_vk, hb_ctrl);
>> if (bcm_vk_drv_access_ok(vk) && hb_mon_is_on()) {
>> @@ -177,22 +177,22 @@ static void bcm_vk_hb_poll(struct timer_list *t)
>> bcm_vk_set_host_alert(vk, ERR_LOG_HOST_HB_FAIL);
>> }
>> /* re-arm timer */
>> - mod_timer(&hb->timer, jiffies + BCM_VK_HB_TIMER_VALUE);
>> + schedule_delayed_work(&hb->work, BCM_VK_HB_TIMER_VALUE);
>> }
>> void bcm_vk_hb_init(struct bcm_vk *vk)
>> {
>> struct bcm_vk_hb_ctrl *hb = &vk->hb_ctrl;
>> - timer_setup(&hb->timer, bcm_vk_hb_poll, 0);
>> - mod_timer(&hb->timer, jiffies + BCM_VK_HB_TIMER_VALUE);
>> + INIT_DELAYED_WORK(&hb->work, bcm_vk_hb_poll);
>> + schedule_delayed_work(&hb->work, BCM_VK_HB_TIMER_VALUE);
>> }
>> void bcm_vk_hb_deinit(struct bcm_vk *vk)
>> {
>> struct bcm_vk_hb_ctrl *hb = &vk->hb_ctrl;
>> - del_timer(&hb->timer);
>> + cancel_delayed_work_sync(&hb->work);
>> }
>> static void bcm_vk_msgid_bitmap_clear(struct bcm_vk *vk,