lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ac7720fd-ed2e-dd77-d60b-7ab535daf167@redhat.com>
Date:   Tue, 30 Oct 2018 11:17:43 +0100
From:   Hans de Goede <hdegoede@...hat.com>
To:     Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>,
        Dean Wallace <duffydack73@...il.com>,
        Andy Shevchenko <andy.shevchenko@...il.com>
Cc:     Stephen Boyd <sboyd@...nel.org>,
        Michael Turquette <mturquette@...libre.com>,
        linux-clk <linux-clk@...r.kernel.org>,
        Stable <stable@...r.kernel.org>,
        Johannes Stezenbach <js@...21.net>,
        Carlo Caione <carlo@...lessm.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Regression found (Stop-marking-clocks-as-CLK_IS_CRITICAL)

Hi Pierre-Louis,

On 29-10-18 23:03, Pierre-Louis Bossart wrote:
> 
> On 10/29/18 2:08 PM, Dean Wallace wrote:
>> On 29-10-18, Andy Shevchenko wrote:
>>> On Mon, Oct 29, 2018 at 7:52 PM Andy Shevchenko
>>> <andy.shevchenko@...il.com> wrote:
>>>> Cc: Pierre as well.
>>>>
>>>> On Mon, Oct 29, 2018 at 7:48 PM Stephen Boyd <sboyd@...nel.org> wrote:
>>>>> Quoting Dean Wallace (2018-10-25 16:25:17)
>>>>>> I have found a regression in 4.18.15 that means I lose sound on my old
>>>>>> Toshiba Chromebook 2 (Swanky).  My system details are:-
>>>>>>
>>>>>> Toshiba Chromebook (Swanky)
>>>>>> MrChromebox UEFI coreboot
>>>>>> Arch Linux running latest alsa/pulseaudio
>>>>>>
>>>>>> Upgraded kernel from 4.18.14 to 4.18.15 and lost all sound output.  By
>>>>>> output I mean, the card is still detected, the module loaded, all apps
>>>>>> showing sound is being playing, but no actual audible sound comes
>>>>>> through.  Upgraded to 4.18.16 same issue.
>>>>>>
>>>>>> Dug around and found Upstream commit 648e921888ad96ea3dc922739e96716ad3225d7f
>>>>>> clk: x86: Stop marking clocks as CLK_IS_CRITICAL
>>>>>> "This commit removes the CLK_IS_CRITICAL marking, fixing Cherry Trail
>>>>>> devices not being able to reach S0i3 greatly decreasing their battery
>>>>>> drain when suspended."
>>>>>>
>>>>>> I reverted it and compiled 4.18.16 and have sound back again.  Could
>>>>>> this be looked into, with possibility of fix.
>>>>>>
>>>>> Thanks for the bug report. I'm adding some people involved in the commit
>>>>> you mention is causing audio regressions. The best plan is to probably
>>>>> revert the commit from the 4.18 linux stable tree. Or there may be
>>>>> another patch missing that would be useful to make this backported patch
>>>>> work. Hopefully Hans or Andy knows.
>>>> Hans has been investigating S0ix issues on Baytrail and Cherrytrail machines.
>>>> I have a feeling that the problem can be fixed by properly handling
>>>> clock in ASoC driver(s). Perhaps Hans and Pierre can figure this out
>>>> better than me.
>>> Looking to sound/soc/intel/boards/cht_bsw_rt5672.c I see no
>>> suspend-resume hooks. Perhaps, adding them like in the commit
>>> ac8bd9e13be2 ("r8169: Disable clk during suspend / resume") would
>>> help.
> 
> I missed this change while i was away. It's indeed the expectation that the audio mclk is handled by the firmware,

I believe that the statement "It's indeed the expectation that the
audio mclk is handled by the firmware" is not correct for BYT/CHT
platforms (and the commit causing the regression only affects BYT/CHT
platforms). Various machine drivers under sound/soc/intel/boards have
code to deal with the mclk themselves, like this:

         drv->mclk = devm_clk_get(&pdev->dev, "pmc_plt_clk_3");
...

         if (SND_SOC_DAPM_EVENT_ON(event)) {
                 ret = clk_prepare_enable(ctx->mclk);
		...
         } else {
                 clk_disable_unprepare(ctx->mclk);
         }

The above code is from sound/soc/intel/boards/cht_bsw_max98090_ti.c
which is the machine driver used on the machines for which problems
are now being reported.

And my commit introducing the problem is in essence a revert of:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=d31fd43c0f9a41e2678a1e78c0f22f0384c6edd3

Which is from 2017-07-14 and the ASoC code for BYT/CHT platforms is
much older then that.

Also I asked Carlo why he wrote that patch and it was to fix a problem
with ethernet on some laptops.

> not sure I understand why removing the CLK_IS_CRITICAL was necessary or what it has to do with S0ix.

It is necessary, because if it is set the clock never gets disabled and
on x86 platforms using suspend2idle we are responsible for *all* the hardware
power-management as OS. If we do not disable the clocks then we can only
reach S0i1 instead of S0i3 when suspended leading to increased battery
drain during suspend.

Also note that this patch is only causing problems on CHT + max98090 codec
using machines. I've tested it on many other BYT/CHT machines myself and
we've not had any other bug reports related to this.

So this clearly points to a problem with the clock management in the
cht_bsw_max98090_ti.c machine driver.

I've some ideas how to fix this and I will prepare some patches to test.

Regards,

Hans

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ