lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d21fe935-5732-7f8c-2c26-3fc3969b4c1c@redhat.com>
Date:   Tue, 30 Oct 2018 15:48:06 +0100
From:   Hans de Goede <hdegoede@...hat.com>
To:     Dean Wallace <duffydack73@...il.com>
Cc:     Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>,
        Andy Shevchenko <andy.shevchenko@...il.com>,
        Stephen Boyd <sboyd@...nel.org>,
        Michael Turquette <mturquette@...libre.com>,
        linux-clk <linux-clk@...r.kernel.org>,
        Stable <stable@...r.kernel.org>,
        Johannes Stezenbach <js@...21.net>,
        Carlo Caione <carlo@...lessm.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Mogens Jensen <mogens-jensen@...tonmail.com>
Subject: Re: Regression found (Stop-marking-clocks-as-CLK_IS_CRITICAL)

Hi,

On 30-10-18 15:38, Dean Wallace wrote:
> On 30-10-18, Hans de Goede wrote:
>> Hi Dean,
>>
>> Attached are 2 different attempts at fixing this.
>>
>> When trying these patches do not forget to remove the revert of the
>> "Stop-marking-clocks-as-CLK_IS_CRITICAL" commit.
>>
>> Please first try the 0001-ASoC-intel-cht_bsw_max98090_ti-Use-pmc_plt_clk_0-ins.patch
>> patch I expect that one to do the trick indicating that the Swanky model
>> uses a different pmc clk then which is normally used for the codec clock.
>>
>> If that patch does not fix things, please give the other patch a try.
>>
>> Regards,
>>
>> Hans
>>
>>
>>
>> On 29-10-18 23:03, Pierre-Louis Bossart wrote:
>>>
>>> On 10/29/18 2:08 PM, Dean Wallace wrote:
>>>> On 29-10-18, Andy Shevchenko wrote:
>>>>> On Mon, Oct 29, 2018 at 7:52 PM Andy Shevchenko
>>>>> <andy.shevchenko@...il.com> wrote:
>>>>>> Cc: Pierre as well.
>>>>>>
>>>>>> On Mon, Oct 29, 2018 at 7:48 PM Stephen Boyd <sboyd@...nel.org> wrote:
>>>>>>> Quoting Dean Wallace (2018-10-25 16:25:17)
>>>>>>>> I have found a regression in 4.18.15 that means I lose sound on my old
>>>>>>>> Toshiba Chromebook 2 (Swanky).  My system details are:-
>>>>>>>>
>>>>>>>> Toshiba Chromebook (Swanky)
>>>>>>>> MrChromebox UEFI coreboot
>>>>>>>> Arch Linux running latest alsa/pulseaudio
>>>>>>>>
>>>>>>>> Upgraded kernel from 4.18.14 to 4.18.15 and lost all sound output.  By
>>>>>>>> output I mean, the card is still detected, the module loaded, all apps
>>>>>>>> showing sound is being playing, but no actual audible sound comes
>>>>>>>> through.  Upgraded to 4.18.16 same issue.
>>>>>>>>
>>>>>>>> Dug around and found Upstream commit 648e921888ad96ea3dc922739e96716ad3225d7f
>>>>>>>> clk: x86: Stop marking clocks as CLK_IS_CRITICAL
>>>>>>>> "This commit removes the CLK_IS_CRITICAL marking, fixing Cherry Trail
>>>>>>>> devices not being able to reach S0i3 greatly decreasing their battery
>>>>>>>> drain when suspended."
>>>>>>>>
>>>>>>>> I reverted it and compiled 4.18.16 and have sound back again.  Could
>>>>>>>> this be looked into, with possibility of fix.
>>>>>>>>
>>>>>>> Thanks for the bug report. I'm adding some people involved in the commit
>>>>>>> you mention is causing audio regressions. The best plan is to probably
>>>>>>> revert the commit from the 4.18 linux stable tree. Or there may be
>>>>>>> another patch missing that would be useful to make this backported patch
>>>>>>> work. Hopefully Hans or Andy knows.
>>>>>> Hans has been investigating S0ix issues on Baytrail and Cherrytrail machines.
>>>>>> I have a feeling that the problem can be fixed by properly handling
>>>>>> clock in ASoC driver(s). Perhaps Hans and Pierre can figure this out
>>>>>> better than me.
>>>>> Looking to sound/soc/intel/boards/cht_bsw_rt5672.c I see no
>>>>> suspend-resume hooks. Perhaps, adding them like in the commit
>>>>> ac8bd9e13be2 ("r8169: Disable clk during suspend / resume") would
>>>>> help.
>>>
>>> I missed this change while i was away. It's indeed the expectation that the audio mclk is handled by the firmware,
>>
>> I believe that the statement "It's indeed the expectation that the
>> audio mclk is handled by the firmware" is not correct for BYT/CHT
>> platforms (and the commit causing the regression only affects BYT/CHT
>> platforms). Various machine drivers under sound/soc/intel/boards have
>> code to deal with the mclk themselves, like this:
>>
>>          drv->mclk = devm_clk_get(&pdev->dev, "pmc_plt_clk_3");
>> ...
>>
>>          if (SND_SOC_DAPM_EVENT_ON(event)) {
>>                  ret = clk_prepare_enable(ctx->mclk);
>> 		...
>>          } else {
>>                  clk_disable_unprepare(ctx->mclk);
>>          }
>>
>> The above code is from sound/soc/intel/boards/cht_bsw_max98090_ti.c
>> which is the machine driver used on the machines for which problems
>> are now being reported.
>>
>> And my commit introducing the problem is in essence a revert of:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=d31fd43c0f9a41e2678a1e78c0f22f0384c6edd3
>>
>> Which is from 2017-07-14 and the ASoC code for BYT/CHT platforms is
>> much older then that.
>>
>> Also I asked Carlo why he wrote that patch and it was to fix a problem
>> with ethernet on some laptops.
>>
>>> not sure I understand why removing the CLK_IS_CRITICAL was necessary or what it has to do with S0ix.
>>
>> It is necessary, because if it is set the clock never gets disabled and
>> on x86 platforms using suspend2idle we are responsible for *all* the hardware
>> power-management as OS. If we do not disable the clocks then we can only
>> reach S0i1 instead of S0i3 when suspended leading to increased battery
>> drain during suspend.
>>
>> Also note that this patch is only causing problems on CHT + max98090 codec
>> using machines. I've tested it on many other BYT/CHT machines myself and
>> we've not had any other bug reports related to this.
>>
>> So this clearly points to a problem with the clock management in the
>> cht_bsw_max98090_ti.c machine driver.
>>
>> I've some ideas how to fix this and I will prepare some patches to test.
>>
>> Regards,
>>
>> Hans
> 
> Excellent work Hans.  Compiled 4.19 with
> 0001-ASoC-intel-cht_bsw_max98090_ti-Use-pmc_plt_clk_0-ins.patch, sound
> works as before.
> 
> for i in /sys/kernel/debug/clk/pmc_plt_clk_?; do echo -n "$i: "; cat $i/clk_flags; echo; done
> /sys/kernel/debug/clk/pmc_plt_clk_0:
> /sys/kernel/debug/clk/pmc_plt_clk_1:
> /sys/kernel/debug/clk/pmc_plt_clk_2:
> /sys/kernel/debug/clk/pmc_plt_clk_3:
> /sys/kernel/debug/clk/pmc_plt_clk_4:
> /sys/kernel/debug/clk/pmc_plt_clk_5:

Ok, so as I expected the Swanky is using pmc_plt_clk_0
as mclk instead of pmc_plt_clk_3. Now the question
becomes is this true for all the designs using the
max98090 codec?

Mogens, can you give the 0001-ASoC-intel-cht_bsw_max98090_ti-Use-pmc_plt_clk_0-ins.patch
a try on your Clapper model, using the SND_SOC_INTEL_CHT_BSW_MAX98090_TI_MACH
kernel option together with the asoundrc from Dean?

I have the feeling that your Clapper is also using the
pmc_plt_clk_0.

Pierre-Louis, as maintainer/reviewer of all the drivers under
sound/soc/intel/boards, how would you like to proceed with this?

I see 2 possible ways forwards:

1) Unconditionally use pmc_plt_clk_0 as mclk (as my test patch does)
in the cht_bsw_max98090_ti.c machine driver

2) Use DMI quirks for the Swansky (and probably also the Clapper) to
use pmc_plt_clk_0 as mclk there while keeping pmc_plt_clk_3 as the
default.

If you (Pierre-Louis) can let me know which solution you prepare then
I can prepare and submit a patch to fix this.

Regards,

Hans

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ