Message-ID: <b6286dcf-d9a4-4dbd-b8e4-5b0640c7dae5@amd.com>
Date: Thu, 11 Dec 2025 15:48:00 -0600
From: Mario Limonciello <mario.limonciello@....com>
To: Lizhi Hou <lizhi.hou@....com>, ogabbay@...nel.org,
quic_jhugo@...cinc.com, dri-devel@...ts.freedesktop.org,
maciej.falkowski@...ux.intel.com
Cc: linux-kernel@...r.kernel.org, max.zhen@....com, sonal.santan@....com
Subject: Re: [PATCH V1] accel/amdxdna: Fix race where send ring appears full
due to delayed head update
On 12/10/25 10:51 PM, Lizhi Hou wrote:
> The firmware sends a response and interrupts the driver before advancing
> the mailbox send ring head pointer.
What's the point of the interrupt then? Is this something that can be
improved in future firmware, or is it really a hardware issue? If it can
be fixed in firmware, it would be ideal to make this workaround
conditional on the firmware version.
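
To make the version-gating suggestion concrete, something along these
lines is what I have in mind. To be clear, this is only a sketch: the
struct, the fields, the version numbers, and the idea that a fixed
firmware exists at all are hypothetical, not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: none of these fields or version numbers are real. */
struct fw_version {
	unsigned int major;
	unsigned int minor;
};

/*
 * Pretend (made up) that firmware 2.5+ advances the send ring head
 * before raising the interrupt; only older firmware would then need
 * the 100us head-pointer poll workaround.
 */
static bool fw_needs_head_poll(const struct fw_version *v)
{
	return v->major < 2 || (v->major == 2 && v->minor < 5);
}
```

The send path could then skip the poll entirely when
fw_needs_head_poll() is false.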
> As a result, the driver may observe
> the response and attempt to send a new request before the firmware has
> updated the head pointer. In this window, the send ring still appears
> full, causing the driver to incorrectly fail the send operation.
>
> This race can be triggered more easily in a multithreaded environment,
> leading to unexpected and spurious "send ring full" failures.
>
> To address this, poll the send ring head pointer for up to 100us before
> returning a full-ring condition. This allows the firmware time to update
> the head pointer.
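
The wrap/full arithmetic after this change took me a second to follow,
so here is a stand-alone model of it as I read it. This is a userspace
sketch with illustrative names, not driver code; ringbuf_size is
assumed to already exclude the tombstone u32 slot, as the patch now
arranges, and the actual tombstone write/poll are elided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative model of the post-patch space check; not driver code.
 * ringbuf_size is assumed to already exclude the tombstone u32 slot.
 */
static bool ring_has_space(uint32_t head, uint32_t tail,
			   uint32_t pkg_size, uint32_t ringbuf_size)
{
	uint32_t tmp_tail = tail + pkg_size;

	/* Message would run past the end: wrap, as if a tombstone were written. */
	if (tail >= head && tmp_tail > ringbuf_size) {
		tail = 0;
		tmp_tail = pkg_size;
	}

	/* The write may not catch up with the (possibly stale) head. */
	return !(tail < head && tmp_tail >= head);
}
```

If the final check fails, the patch polls for the head to advance
instead of failing immediately with -ENOSPC.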
>
> Fixes: b87f920b9344 ("accel/amdxdna: Support hardware mailbox")
> Signed-off-by: Lizhi Hou <lizhi.hou@....com>
> ---
> drivers/accel/amdxdna/amdxdna_mailbox.c | 27 +++++++++++++++----------
> 1 file changed, 16 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/accel/amdxdna/amdxdna_mailbox.c b/drivers/accel/amdxdna/amdxdna_mailbox.c
> index a60a85ce564c..469242ed8224 100644
> --- a/drivers/accel/amdxdna/amdxdna_mailbox.c
> +++ b/drivers/accel/amdxdna/amdxdna_mailbox.c
> @@ -191,26 +191,34 @@ mailbox_send_msg(struct mailbox_channel *mb_chann, struct mailbox_msg *mb_msg)
>  	u32 head, tail;
>  	u32 start_addr;
>  	u32 tmp_tail;
> +	int ret;
>  
>  	head = mailbox_get_headptr(mb_chann, CHAN_RES_X2I);
>  	tail = mb_chann->x2i_tail;
> -	ringbuf_size = mailbox_get_ringbuf_size(mb_chann, CHAN_RES_X2I);
> +	ringbuf_size = mailbox_get_ringbuf_size(mb_chann, CHAN_RES_X2I) - sizeof(u32);
>  	start_addr = mb_chann->res[CHAN_RES_X2I].rb_start_addr;
>  	tmp_tail = tail + mb_msg->pkg_size;
>  
> -	if (tail < head && tmp_tail >= head)
> -		goto no_space;
> -
> -	if (tail >= head && (tmp_tail > ringbuf_size - sizeof(u32) &&
> -			     mb_msg->pkg_size >= head))
> -		goto no_space;
>  
> -	if (tail >= head && tmp_tail > ringbuf_size - sizeof(u32)) {
> +check_again:
> +	if (tail >= head && tmp_tail > ringbuf_size) {
>  		write_addr = mb_chann->mb->res.ringbuf_base + start_addr + tail;
>  		writel(TOMBSTONE, write_addr);
>  
>  		/* tombstone is set. Write from the start of the ringbuf */
>  		tail = 0;
> +		tmp_tail = tail + mb_msg->pkg_size;
> +	}
> +
> +	if (tail < head && tmp_tail >= head) {
> +		ret = read_poll_timeout(mailbox_get_headptr, head,
> +					tmp_tail < head || tail >= head,
> +					1, 100, false, mb_chann, CHAN_RES_X2I);
> +		if (ret)
> +			return ret;
> +
> +		if (tail >= head)
> +			goto check_again;
>  	}
>  
>  	write_addr = mb_chann->mb->res.ringbuf_base + start_addr + tail;
> @@ -222,9 +230,6 @@ mailbox_send_msg(struct mailbox_channel *mb_chann, struct mailbox_msg *mb_msg)
>  		mb_msg->pkg.header.id);
>  
>  	return 0;
> -
> -no_space:
> -	return -ENOSPC;
>  }
>
> static int
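
And for anyone else reading along: my understanding is that the
read_poll_timeout() call above boils down to roughly the open-coded
loop below. This is a userspace approximation only; the fixed
iteration bound stands in for the 1us/100us sleep/timeout pair,
get_head for mailbox_get_headptr(), and the kernel macro additionally
does one final condition check after the timeout expires.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Userspace approximation; all names are stand-ins, not driver code. */
static int poll_head(uint32_t (*get_head)(void *), void *chan,
		     uint32_t tail, uint32_t tmp_tail, uint32_t *headp)
{
	for (int spins = 0; spins < 100; spins++) {
		uint32_t head = get_head(chan);

		/* Same exit condition as the patch's read_poll_timeout(). */
		if (tmp_tail < head || tail >= head) {
			*headp = head;
			return 0;
		}
	}
	return -ETIMEDOUT;
}

/* Demo head source: stale for the first few reads, then advanced. */
static int demo_calls;
static uint32_t demo_head(void *unused)
{
	(void)unused;
	return demo_calls++ < 3 ? 8 : 32;
}
```

With tail = 0 and tmp_tail = 16, the first few reads (head = 8) keep
polling, and the loop exits as soon as the head advances past the new
tail.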