Message-ID: <4ecf3a7a-7a68-fd56-ed93-fbae82e2b0e3@huawei.com>
Date: Tue, 11 Mar 2025 19:40:53 +0800
From: "lihuisong (C)" <lihuisong@...wei.com>
To: Sudeep Holla <sudeep.holla@....com>, <linux-acpi@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
CC: Jassi Brar <jassisinghbrar@...il.com>, Adam Young
<admiyo@...amperecomputing.com>, Robbie King <robbiek@...ghtlabs.com>
Subject: Re: [PATCH v2 01/13] mailbox: pcc: Fix the possible race in updation
of chan_in_use flag
On 2025/3/6 0:38, Sudeep Holla wrote:
> From: Huisong Li <lihuisong@...wei.com>
>
> The function mbox_chan_received_data() calls the Rx callback of the
> mailbox client driver. The callback might set chan_in_use flag from
> pcc_send_data(). This flag's status determines whether the PCC channel
> is in use.
>
> However, there is a potential race condition where chan_in_use is
> updated incorrectly due to concurrency between the interrupt handler
> (pcc_mbox_irq()) and the command sender (pcc_send_data()).
>
> The 'chan_in_use' flag of a channel is set to true after sending a
> command. The flag of the new command may then be cleared erroneously by
> the interrupt handler after mbox_chan_received_data() returns.
>
> As a result, the interrupt being level triggered can't be cleared in
> pcc_mbox_irq() and it will be disabled after the number of handled times
> exceeds the specified value. The error log is as follows:
>
> | kunpeng_hccs HISI04B2:00: PCC command executed timeout!
> | kunpeng_hccs HISI04B2:00: get port link status info failed, ret = -110
> | irq 13: nobody cared (try booting with the "irqpoll" option)
> | Call trace:
> | dump_backtrace+0x0/0x210
> | show_stack+0x1c/0x2c
> | dump_stack+0xec/0x130
> | __report_bad_irq+0x50/0x190
> | note_interrupt+0x1e4/0x260
> | handle_irq_event+0x144/0x17c
> | handle_fasteoi_irq+0xd0/0x240
> | __handle_domain_irq+0x80/0xf0
> | gic_handle_irq+0x74/0x2d0
> | el1_irq+0xbc/0x140
> | mnt_clone_write+0x0/0x70
> | file_update_time+0xcc/0x160
> | fault_dirty_shared_page+0xe8/0x150
> | do_shared_fault+0x80/0x1d0
> | do_fault+0x118/0x1a4
> | handle_pte_fault+0x154/0x230
> | __handle_mm_fault+0x1ac/0x390
> | handle_mm_fault+0xf0/0x250
> | do_page_fault+0x184/0x454
> | do_translation_fault+0xac/0xd4
> | do_mem_abort+0x44/0xb4
> | el0_da+0x40/0x74
> | el0_sync_handler+0x60/0xb4
> | el0_sync+0x168/0x180
> | handlers:
> | pcc_mbox_irq
> | Disabling IRQ #13
>
> To solve this issue, pcc_mbox_irq() must clear 'chan_in_use' flag before
> the call to mbox_chan_received_data().
>
> Signed-off-by: Huisong Li <lihuisong@...wei.com>
> (sudeep.holla: Minor updates to the subject and commit message)
> Signed-off-by: Sudeep Holla <sudeep.holla@....com>
> ---
> drivers/mailbox/pcc.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
> index 82102a4c5d68839170238540a6fed61afa5185a0..f2e4087281c70eeb5b9b33371596613a371dff4f 100644
> --- a/drivers/mailbox/pcc.c
> +++ b/drivers/mailbox/pcc.c
> @@ -333,10 +333,15 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
> if (pcc_chan_reg_read_modify_write(&pchan->plat_irq_ack))
> return IRQ_NONE;
>
> + /*
> + * Clear this flag immediately after updating interrupt ack register
> + * to avoid possible race in updatation of the flag from
> + * pcc_send_data() that could execute from mbox_chan_received_data()
This comment may become inaccurate because patch 2/13 moves the clearing of
the interrupt ack register.
I suggest fixing it either in this patch or in patch 2/13.
> + */
> + pchan->chan_in_use = false;
> mbox_chan_received_data(chan, NULL);
>
> check_and_ack(pchan, chan);
> - pchan->chan_in_use = false;
>
> return IRQ_HANDLED;
> }
>
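For reference, a minimal user-space sketch of the ordering problem described in the commit message, as I understand it. It only models the case where the client's Rx callback sends the next command from within mbox_chan_received_data(); it does not model true concurrency between CPUs. All sim_* names are hypothetical stand-ins for illustration, not kernel structures or APIs.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the PCC channel state; not the kernel struct. */
struct sim_chan {
	bool chan_in_use;	/* true while a command is outstanding */
	int pending_cmds;	/* commands the client still wants to send */
};

/* Models pcc_send_data(): sending a command marks the channel busy. */
static void sim_send_data(struct sim_chan *c)
{
	c->chan_in_use = true;
}

/* Models the client's Rx callback, which may send the next command. */
static void sim_rx_callback(struct sim_chan *c)
{
	if (c->pending_cmds > 0) {
		c->pending_cmds--;
		sim_send_data(c);
	}
}

/* Old ordering: the flag is cleared after the callback returns. */
static void sim_irq_old(struct sim_chan *c)
{
	sim_rx_callback(c);
	c->chan_in_use = false;	/* wipes the flag set for the new command */
}

/* Patched ordering: the flag is cleared before the callback runs. */
static void sim_irq_new(struct sim_chan *c)
{
	c->chan_in_use = false;
	sim_rx_callback(c);
}

/*
 * Simulate one completion interrupt on a busy channel whose client has
 * one more command queued, and return the flag value left behind.
 */
bool sim_flag_after_irq(bool patched)
{
	struct sim_chan c = { .chan_in_use = true, .pending_cmds = 1 };

	if (patched)
		sim_irq_new(&c);
	else
		sim_irq_old(&c);

	return c.chan_in_use;
}
```

With the old ordering the flag ends up false although a new command is in flight, so the completion interrupt for that command would not be treated as belonging to a channel in use. With the patched ordering the flag stays true.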