linux-kernel - Re: [PATCH v3] soundwire: stream: Prepare ports in parallel to reduce stream start latency

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <a1b2b617-448d-4457-891e-d2cc00858f50@opensource.cirrus.com>
Date: Mon, 22 Dec 2025 12:01:16 +0000
From: Richard Fitzgerald <rf@...nsource.cirrus.com>
To: Pierre-Louis Bossart <pierre-louis.bossart@...ux.dev>, vkoul@...nel.org,
        yung-chuan.liao@...ux.intel.com
Cc: linux-sound@...r.kernel.org, linux-kernel@...r.kernel.org,
        patches@...nsource.cirrus.com
Subject: Re: [PATCH v3] soundwire: stream: Prepare ports in parallel to reduce
 stream start latency

On 20/12/25 11:15, Pierre-Louis Bossart wrote:
> On 12/10/25 10:59, Richard Fitzgerald wrote:
>> On 9/12/25 16:41, Pierre-Louis Bossart wrote:
>>>
>>>>>> Changes in V2:
>>>>>> +    if (simple_ch_prep_sm)
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    /*
>>>>>> +     * Check if already prepared. Avoid overhead of waiting for interrupt
>>>>>> +     * and port_ready completion if we don't need to.
>>>>>> +     */
>>>
>>> 1.
>>>
>>>>>> +    val = sdw_read_no_pm(s_rt->slave, SDW_DPN_PREPARESTATUS(p_rt->num));
>>>>>> +    if (val < 0) {
>>>>>> +        ret = val;
>>>>>> +        goto err;
>>>>>> +    }
>>>>>> +
>>>>>> +    if (val & p_rt->ch_mask) {
>>>>>
>>>>> Can you explain why we don't use the ch_mask in the already-prepared case? I am missing something.
>>>>>
>>>> I'm not sure what you mean here. The if() immediately above your comment
>>>> uses ch_mask to check the already-prepared state.
>>>
>>> I was referring to the 1. above, you read the prepare status without checking for ch_mask first.
>>>
>>
>> What would be the purpose of checking ch_mask before the read?
> 
> I don't know - why do we need to read it in the second case and not the first is all I am asking.
>

What the code is doing is:

1. Read the current prepare status
2. Check if any channels we prepared are still reporting NOT_PREPARED
3. If there are any NOT_PREPARED channels we need to do the wait, else
    we can skip it

>>>>>> +        /* Wait for completion on port ready */
>>>>>> +        port_ready = &s_rt->slave->port_ready[p_rt->num];
>>>>>> +        wait_for_completion_timeout(port_ready, msecs_to_jiffies(ch_prep_timeout));
>>>>>
>>>>> I understand the code is the same as before but would there be any merit in checking the timeout before starting a read? If the device is already in the weeds, doing another read adds even more time before reporting an error.
>>>>>
>>>> Do you mean save the system time when the DPN_PREPARE was written to
>>>> that peripheral and then check here whether the timeout period has
>>>> already elapsed?
>>>
>>> I meant testing the return value of wait_for_completion_timeout(). If you already timed out at this point with a return value of zero, there's no point in checking the status any more, the system is in the weeds.
>>>
>>
>> Wait completion will _always_ timeout because this code is holding the
>> bus lock, which blocks the ALERT handler from running and signalling
>> the completion. The wait_for_completion_timeout() is effectively
>> msleep(msecs_to_jiffies(ch_prep_timeout));
>> So we have to read the register afterwards to see whether the peripheral
>> actually prepared.
>>
>> I've left the useless wait_for_completion_timeout() in the code so this
>> commit is only changing what it says it is changing, and nothing else.
>>
>> What to do about the deadlocked wait_for_completion_timeout() is a
>> separate problem.
> 
> Humm, what happens if you have a single peripheral, does this result in an increase of the prepare time all the way to the timeout value?
> I can see how preparing all ports in parallel would reduce the total time compared to a serial approach, even with a timeout, but if we end-up always timing out even in the case where there is a single device it'd be quite odd.

Yes, but that's a separate problem that needs fixing. The current code
in the kernel does the timeout for every amp. So if you have 4 amps with
a timeout of 10ms it takes >40ms to do the port prepare.

> In other words does this patch reduce the start latency only in the case of multiple devices, but adds a 'tax' for all other cases?

This only affects peripherals that use the full CP_SM (not simplified).

If you have only 1 peripheral there is either no change or a possibility
of skipping the wait. Before this change it would always wait the full
timeout before noticing that it is ready.

After this change, you might be able to skip the wait. If it prepared
immediately, so that it is already reporting prepared when the first
read tests for that, it will skip the wait.
(BTW I see that happening with CS35L56. It prepares so fast that it
is already done, without waiting for any of the timeouts.)

If it isn't ready that early, you get the timeout as before.

It isn't adding any penalty to single devices, unless you count the
trivial time it takes to do the register read, which is negligible
compared to getting deadlocked in the wait_for_completion_timeout().