lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 3 May 2023 09:11:15 -0500
From:   Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>
To:     Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>,
        Vinod Koul <vkoul@...nel.org>,
        Bard Liao <yung-chuan.liao@...ux.intel.com>,
        Sanyog Kale <sanyog.r.kale@...el.com>,
        Andy Gross <agross@...nel.org>,
        Bjorn Andersson <andersson@...nel.org>,
        Konrad Dybcio <konrad.dybcio@...aro.org>,
        Liam Girdwood <lgirdwood@...il.com>,
        Mark Brown <broonie@...nel.org>,
        Jaroslav Kysela <perex@...ex.cz>,
        Takashi Iwai <tiwai@...e.com>, alsa-devel@...a-project.org,
        linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org
Cc:     Patrick Lai <quic_plai@...cinc.com>
Subject: Re: [PATCH 6/6] soundwire: qcom: do not probe devices before bus/link
 init



On 5/3/23 03:00, Krzysztof Kozlowski wrote:
> On 01/05/2023 15:43, Pierre-Louis Bossart wrote:
>>
>>
>> On 5/1/23 07:24, Krzysztof Kozlowski wrote:
>>> On 20/04/2023 23:37, Pierre-Louis Bossart wrote:
>>>>
>>>>
>>>> On 4/20/23 05:16, Krzysztof Kozlowski wrote:
>>>>> Soundwire devices are supposed to be kept in reset state (powered off)
>>>>> till their probe() or component bind() callbacks.  However if they are
>>>>> already powered on, then they might enumerate before the master
>>>>> initializes bus in qcom_swrm_init() leading to occasional errors like:
>>>>
>>>> The problem statement is really hard to follow.
>>>>
>>>> The peripheral can only be enumerated AFTER
>>>> a) the manager starts the bus clock and transmitting PING frames
>>>> b) the peripheral detects the sync words for 16 frames in a row.
>>>> c) the peripheral reports as Attached in the Device0 slot
>>>>
>>>> That sequence holds whether the manager does the enumeration manually or
>>>> relies on hardware-assisted autoenumeration. This is what the spec requires.
>>>>
>>>> So why can't the bus clock start be controlled by the manager driver,
>>>> and started once all required initializations are done?
>>>>
>>>> I mean, there's got to be some sort of parent-child hierarchy with
>>>> manager first, peripheral(s) second, I don't get how these steps could
>>>> be inverted or race.
>>>>
>>>>>   qcom-soundwire 6d30000.soundwire-controller: Qualcomm Soundwire controller v2.0.0 Registered
>>>>>   wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:4 (ops wcd938x_sdw_component_ops)
>>>>>   wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:3 (ops wcd938x_sdw_component_ops)
>>>>>   qcom-soundwire 6ad0000.soundwire-controller: swrm_wait_for_wr_fifo_avail err write overflow
>>>>>
>>>>> The problem primarily lies in Qualcomm Soundwire controller probe() sequence:
>>>>> 1. request_threaded_irq()
>>>>> 2. sdw_bus_master_add() - which will cause probe() and component bind()
>>>>>    of Soundwire devices, e.g. WCD938x codec drivers.  Device drivers
>>>>>    might already start accessing their registers.
>>>>
>>>> not if the bus clock hasn't started...
>>>>
>>>>> 3. qcom_swrm_init() - which initializes the link/bus and enables
>>>>>    interrupts.
>>>>
>>>> if you can move the clock start in 3) then problem solved. Why can't
>>>> this be done?
>>>
>>> Responding to all your three responses:
>>> The clock is enabled in this 3. qcom_swrm_init(), so the old code to my
>>> knowledge is written exactly how you expect.
>>>
>>> However even with stopped clock, the device enumerates at
>>> sdw_bus_master_add(), before anything is enabled.
>>
>> Erm, that's not physically possible...
>>
>> The peripheral can report as attached and be enumerated by the manager,
>> i.e. assigned a non-zero "Device Number" after the peripheral
>> synchronizes on 16 frames with valid static and dynamic syncwords. That
>> can only happen if there is a clock toggling and PING frames transmitted
>> on the data line.
>>
>> There's something else at play here.
> 
> Yes, I think you are right and that "else" is my limited knowledge on
> the entire setup.
> 
> You gave me awesome hint in email before that probe != enumeration !=
> initialization, however the wcd938x sound codec drivers were assuming
> some steps are equal.
> 
> wcd938x comes with three devices on two drivers:
> 1. RX Soundwire device (wcd938x-sdw.c driver)
> 2. TX Soundwire device, which is used as regmap (also wcd938x-sdw.c driver)
> 3. platform device (wcd938x.c driver) - glue and component master,
> actually having most of the code using TX Soundwire device regmap.
> 
> The probe of each RX and TX Soundwire devices added components, but that
> this did not mean devices were enumerated, as you said.
> 
> Considering what Mark said about using regcache (and sync it), I am now
> replacing entire solution with proper regcache handling device
> enumeration after component bind.

I thought you were talking about ASoC "struct snd_soc_component" but no,
that's "struct component_ops" in wcd938x-sdw.c

Now I think I get the scope of the problem, I didn't click on the fact
that it's a platform driver that registers ASoC components, and not the
SoundWire codec driver.

That's certainly more complicated than any other SoundWire-based systems
I've seen so far. It's already difficult to keep track of the SoundWire
peripheral driver .probe(), the ASoC component .probe(), the SoundWire
bus .update_status() callback and now there's
the component .bind() added.

Wow. What could possibly go wrong, eh?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ