lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f80c7bf2-1d9e-40d7-d3bd-c8bbe548934b@linux.intel.com>
Date:   Tue, 21 Jun 2022 08:11:48 -0500
From:   Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>
To:     noman pouigt <variksla@...il.com>, Mark Brown <broonie@...nel.org>,
        liam.r.girdwood@...ux.intel.com
Cc:     alsa-devel@...a-project.org, peter.ujfalusi@...ux.intel.com,
        ranjani.sridharan@...ux.intel.com, linux-kernel@...r.kernel.org,
        yangyingliang@...wei.com, rander.wang@...el.com,
        yung-chuan.liao@...ux.intel.com,
        sound-open-firmware@...a-project.org
Subject: Re: [Sound-open-firmware] ASoC: SOF: Race condition in ipc.c



On 6/20/22 21:46, noman pouigt wrote:
> Folks,
> 
> I have borrowed part of SOF architecture for my own DSP
> framework development as the memory on the DSP is
> extremely small and wouldn't be able to support SOF.

You're providing very little context here. Of course it's open-source
and you can do whatever you want with the code, but it's not clear if
the 'borrowed' code can deal with your specific case.

> Currently I am running into a race condition as below:
> 
> CPU                                            DSP
> PCM_TRIGGER_START
> sof_ipc_send_msg ---->
> 
>                                       <------Immediate ACK
> ipc3_wait_tx_done
> (wait_event_timeout)
>                                       <------ POSITION update
> 
> snd_pcm_period_elapsed
> 
> 
> As you can see TRIGGER_START didn't even finish
> and waiting for it to complete in ipc3_wait_tx_done
> function. However, before it could complete the position
> interrupt was issued which results in calling period_elapsed
> function.
> 
> In order to fix this I assume below is called in SOF framework:
> schedule_work(&spcm->stream[substream->stream].period_elapsed_work);
> 
> How is this design working? If the interrupt is coming too fast
> from the DSP than the associated function with this schedule_work
> will not get called as the scheduler will not get time to schedule the
> workqueue and elapsed function will not be called thereby not increasing
> the hw_ptr. How is the flow control for data transfer achieved?

The schedule_work was added by this commit, and the explanations are
rather straightforward:

commit e2803e610aecb36ea4fec5a04861547664580d0c

Author: Keyon Jie <yang.jie@...ux.intel.com>

Date:   Tue Apr 30 18:09:25 2019 -0500




ASoC: SOF: PCM: add period_elapsed work to fix race condition in
interrupt context

"
The IPC implementation in SOF requires sending IPCs serially: we should
not send a new IPC command to the firmware before we get an ACK (or time
out) from firmware, and the IRQ processing is complete.

snd_pcm_period_elapsed() can be called in interrupt context before
IRQ_HANDLED is returned. When the PCM is done draining, a STOP
IPC will then be sent, which breaks the expectation that IPCs are
handled serially and leads to IPC timeouts.

This patch adds a workqueue to defer the call to snd_pcm_elapsed() after
the IRQ is handled.
"

I am not sure what the problem you have really is.

It's not really surprising that the first period is consumed
immediately, the flow will become more steady-state afterwards.

The DMAs should be primed to deal with more than one period, and the
schedule_work() will only signal that the application can refill the
ring buffer. There's all kinds of delays that will happen depending on
load and scheduling, and if the POSITION_UPDATE was received immediately
then there's should be still plenty of time to provide a new buffer.

> 
> I am facing the above problem in my design.
> 
> I am wondering if I can simply add one more IPC command(don't call
> wait_event_interruptible for this) after TRIGGER_START to start the
> streaming.This way schedule_work for updating period_elapsed function
> can be avoided and it can be called in an interrupt context.

See commit above, this won't work because you'll be sending an IPC while
dealing with an IPC.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ