[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b9c412cf-32af-4cc7-af6d-d4be5b796ccd@amd.com>
Date: Fri, 7 Feb 2025 18:05:18 +0530
From: "Mukunda,Vijendar" <vijendar.mukunda@....com>
To: Cristian Ciocaltea <cristian.ciocaltea@...labora.com>,
Liam Girdwood <lgirdwood@...il.com>,
Peter Ujfalusi <peter.ujfalusi@...ux.intel.com>,
Bard Liao <yung-chuan.liao@...ux.intel.com>,
Ranjani Sridharan <ranjani.sridharan@...ux.intel.com>,
Daniel Baluta <daniel.baluta@....com>,
Kai Vehmanen <kai.vehmanen@...ux.intel.com>,
Pierre-Louis Bossart <pierre-louis.bossart@...ux.dev>,
Mark Brown <broonie@...nel.org>, Jaroslav Kysela <perex@...ex.cz>,
Takashi Iwai <tiwai@...e.com>,
Venkata Prasad Potturu <venkataprasad.potturu@....com>,
"Hiregoudar, Basavaraj" <Basavaraj.Hiregoudar@....com>,
"Dommati, Sunil-kumar" <Sunil-kumar.Dommati@....com>
Cc: kernel@...labora.com, sound-open-firmware@...a-project.org,
linux-sound@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/4] ASoC: SOF: amd: Add post_fw_run_delay ACP quirk
On 07/02/25 17:16, Cristian Ciocaltea wrote:
> Stress testing resume from suspend on Valve Steam Deck OLED (Galileo)
> revealed that the DSP firmware could enter an unrecoverable faulty
> state, where the kernel ring buffer is flooded with IPC related error
> messages:
>
> [ +0.017002] snd_sof_amd_vangogh 0000:04:00.5: acp_sof_ipc_send_msg: Failed to acquire HW lock
> [ +0.000054] snd_sof_amd_vangogh 0000:04:00.5: ipc3_tx_msg_unlocked: ipc message send for 0x30100000 failed: -22
> [ +0.000005] snd_sof_amd_vangogh 0000:04:00.5: Failed to setup widget PIPELINE.6.ACPHS1.IN
> [ +0.000004] snd_sof_amd_vangogh 0000:04:00.5: Failed to restore pipeline after resume -22
> [ +0.000003] snd_sof_amd_vangogh 0000:04:00.5: PM: dpm_run_callback(): pci_pm_resume returns -22
> [ +0.000009] snd_sof_amd_vangogh 0000:04:00.5: PM: failed to resume async: error -22
> [...]
> [ +0.002582] PM: suspend exit
> [ +0.065085] snd_sof_amd_vangogh 0000:04:00.5: ipc tx error for 0x30130000 (msg/reply size: 12/0): -22
> [ +0.000499] snd_sof_amd_vangogh 0000:04:00.5: error: failed widget list set up for pcm 1 dir 0
> [ +0.000011] snd_sof_amd_vangogh 0000:04:00.5: error: set pcm hw_params after resume
> [ +0.000006] snd_sof_amd_vangogh 0000:04:00.5: ASoC: error at snd_soc_pcm_component_prepare on 0000:04:00.5: -22
> [...]
>
> A system reboot would be necessary to restore the speakers
> functionality.
>
> However, by delaying a bit any host to DSP transmission right after
> the firmware boot completed, the issue could not be reproduced anymore
> and sound continued to work flawlessly even after performing thousands
> of suspend/resume cycles.
>
> Introduce the post_fw_run_delay ACP quirk to allow providing the
> aforementioned delay via the snd_sof_dsp_ops->post_fw_run() callback for
> the affected devices.
As per our understanding, Till ACP firmware starts responding to the
driver's IPC messages, FW won't acquire the HW lock.
Adding delay in post_fw_run() callback seems to not a correct solution,
as there are no IPC messages will be sent from host till FW boot ready
is received.
Is fail to acquire the HW lock is for first host IPC message?
>
> Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@...labora.com>
> ---
> sound/soc/sof/amd/acp.c | 1 +
> sound/soc/sof/amd/acp.h | 1 +
> sound/soc/sof/amd/vangogh.c | 18 ++++++++++++++++++
> 3 files changed, 20 insertions(+)
>
> diff --git a/sound/soc/sof/amd/acp.c b/sound/soc/sof/amd/acp.c
> index 33648ff8b83365e76d7d90e52c2cb8f884a2fe72..9e13c96528be3371e063072513c118cfc8b93fe8 100644
> --- a/sound/soc/sof/amd/acp.c
> +++ b/sound/soc/sof/amd/acp.c
> @@ -27,6 +27,7 @@ MODULE_PARM_DESC(enable_fw_debug, "Enable Firmware debug");
> static struct acp_quirk_entry quirk_valve_galileo = {
> .signed_fw_image = true,
> .skip_iram_dram_size_mod = true,
> + .post_fw_run_delay = true,
> };
>
> const struct dmi_system_id acp_sof_quirk_table[] = {
> diff --git a/sound/soc/sof/amd/acp.h b/sound/soc/sof/amd/acp.h
> index 800594440f73914e7b8ccaf86369ac686e1da630..2a19d82d6200223cdfccd49fbcf1b52968ae1230 100644
> --- a/sound/soc/sof/amd/acp.h
> +++ b/sound/soc/sof/amd/acp.h
> @@ -220,6 +220,7 @@ struct sof_amd_acp_desc {
> struct acp_quirk_entry {
> bool signed_fw_image;
> bool skip_iram_dram_size_mod;
> + bool post_fw_run_delay;
> };
>
> /* Common device data struct for ACP devices */
> diff --git a/sound/soc/sof/amd/vangogh.c b/sound/soc/sof/amd/vangogh.c
> index 8e2672106ac60a22824836a944503a05616f8661..d5f1dddd43e72dd62b3d031130193e6125edf9df 100644
> --- a/sound/soc/sof/amd/vangogh.c
> +++ b/sound/soc/sof/amd/vangogh.c
> @@ -11,6 +11,7 @@
> * Hardware interface for Audio DSP on Vangogh platform
> */
>
> +#include <linux/delay.h>
> #include <linux/platform_device.h>
> #include <linux/module.h>
>
> @@ -136,6 +137,20 @@ static struct snd_soc_dai_driver vangogh_sof_dai[] = {
> },
> };
>
> +static int sof_vangogh_post_fw_run_delay(struct snd_sof_dev *sdev)
> +{
> + /*
> + * Resuming from suspend in some cases my cause the DSP firmware
> + * to enter an unrecoverable faulty state. Delaying a bit any host
> + * to DSP transmission right after firmware boot completion seems
> + * to resolve the issue.
> + */
> + if (!sdev->first_boot)
> + usleep_range(100, 150);
> +
> + return 0;
> +}
> +
> /* Vangogh ops */
> struct snd_sof_dsp_ops sof_vangogh_ops;
> EXPORT_SYMBOL_NS(sof_vangogh_ops, "SND_SOC_SOF_AMD_COMMON");
> @@ -157,6 +172,9 @@ int sof_vangogh_ops_init(struct snd_sof_dev *sdev)
>
> if (quirks->signed_fw_image)
> sof_vangogh_ops.load_firmware = acp_sof_load_signed_firmware;
> +
> + if (quirks->post_fw_run_delay)
> + sof_vangogh_ops.post_fw_run = sof_vangogh_post_fw_run_delay;
> }
>
> return 0;
>
Powered by blists - more mailing lists