lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <93dcf61a-be80-8870-48f5-30e215cc56ed@baylibre.com>
Date:   Fri, 20 Nov 2020 10:26:47 +0100
From:   Neil Armstrong <narmstrong@...libre.com>
To:     Marc Zyngier <maz@...nel.org>, Jerome Brunet <jbrunet@...libre.com>
Cc:     Guillaume Tucker <guillaume.tucker@...labora.com>,
        kernelci-results@...ups.io, Kevin Hilman <khilman@...libre.com>,
        linux-arm-kernel@...ts.infradead.org,
        linux-amlogic@...ts.infradead.org,
        Martin Blumenstingl <martin.blumenstingl@...glemail.com>,
        David Airlie <airlied@...ux.ie>,
        Daniel Vetter <daniel@...ll.ch>,
        dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: next/master bisection: baseline.dmesg.emerg on meson-gxbb-p200

On 19/11/2020 19:35, Marc Zyngier wrote:
> On 2020-11-19 18:13, Jerome Brunet wrote:
>> On Thu 19 Nov 2020 at 19:04, Guillaume Tucker
>> <guillaume.tucker@...labora.com> wrote:
>>
>>> Hi Marc,
>>>
>>> On 19/11/2020 11:58, Marc Zyngier wrote:
>>>> On 2020-11-19 10:26, Neil Armstrong wrote:
>>>>> On 19/11/2020 11:20, Marc Zyngier wrote:
>>>>>> On 2020-11-19 08:50, Guillaume Tucker wrote:
>>>>>>> Please see the automated bisection report below about some kernel
>>>>>>> errors on meson-gxbb-p200.
>>>>>>>
>>>>>>> Reports aren't automatically sent to the public while we're
>>>>>>> trialing new bisection features on kernelci.org, however this one
>>>>>>> looks valid.
>>>>>>>
>>>>>>> The bisection started with next-20201118 but the errors are still
>>>>>>> present in next-20201119.  Details for this regression:
>>>>>>>
>>>>>>>   https://kernelci.org/test/case/id/5fb6196bfd0127fd68d8d902/
>>>>>>>
>>>>>>> The first error is:
>>>>>>>
>>>>>>>   [   14.757489] Internal error: synchronous external abort: 96000210
>>>>>>> [#1] PREEMPT SMP
>>>>>>
>>>>>> Looks like yet another clock ordering setup. I guess different Amlogic
>>>>>> platforms have slightly different ordering requirements.
>>>>>>
>>>>>> Neil, do you have any idea of which platform requires which ordering?
>>>>>> The variability in DT and platforms is pretty difficult to follow (and
>>>>>> I don't think I have such board around).
>>>>>
>>>>> The requirements should be the same, here the init was done before calling
>>>>> dw_hdmi_probe to be sure the clocks and internals resets were deasserted.
>>>>> But since you boot from u-boot already enabling these, it's already active.
>>>>>
>>>>> The solution would be to revert and do some check in meson_dw_hdmi_init() to
>>>>> check if already enabled and do nothing.
>>>>
>>>> A better fix seems to be this, which makes it explicit that there is
>>>> a dependency between some of the registers accessed from meson_dw_hdmi_init()
>>>> and the iahb clock.
>>>>
>>>> Guillaume, can you give this a go on your failing box?
>>>
>>> I confirm it solves the problem.  Please add this to your fix
>>> patch if it's OK with you:
>>>
>>>   Reported-by: "kernelci.org bot" <bot@...nelci.org>
>>>   Tested-by: Guillaume Tucker <guillaume.tucker@...labora.com>
>>>
>>>
>>> For the record, it passed all the tests when applied on top of
>>> the "bad" revision found by the bisection:
>>>
>>>   http://lava.baylibre.com:10080/scheduler/alljobs?search=v5.10-rc3-1021-gb8668a2e5ea1
>>>
>>> and the exact same test on the "bad" revision without the fix
>>> consistently showed the error:
>>>
>>>   http://lava.baylibre.com:10080/scheduler/job/374176
>>>
>>>
>>> Thanks,
>>> Guillaume
>>>
>>>
>>>> diff --git a/drivers/gpu/drm/meson/meson_dw_hdmi.c b/drivers/gpu/drm/meson/meson_dw_hdmi.c
>>>> index 7f8eea494147..52af8ba94311 100644
>>>> --- a/drivers/gpu/drm/meson/meson_dw_hdmi.c
>>>> +++ b/drivers/gpu/drm/meson/meson_dw_hdmi.c
>>>> @@ -146,6 +146,7 @@ struct meson_dw_hdmi {
>>>>      struct reset_control *hdmitx_ctrl;
>>>>      struct reset_control *hdmitx_phy;
>>>>      struct clk *hdmi_pclk;
>>>> +    struct clk *iahb_clk;
>>>>      struct clk *venci_clk;
>>>>      struct regulator *hdmi_supply;
>>>>      u32 irq_stat;
>>>> @@ -1033,6 +1034,13 @@ static int meson_dw_hdmi_bind(struct device *dev, struct device *master,
>>>>      }
>>>>      clk_prepare_enable(meson_dw_hdmi->hdmi_pclk);
>>>>
>>>> +    meson_dw_hdmi->iahb_clk = devm_clk_get(dev, "iahb");
>>>> +    if (IS_ERR(meson_dw_hdmi->iahb_clk)) {
>>>> +        dev_err(dev, "Unable to get iahb clk\n");
>>>> +        return PTR_ERR(meson_dw_hdmi->iahb_clk);
>>>> +    }
>>>> +    clk_prepare_enable(meson_dw_hdmi->iahb_clk);


On previous SoCs, iahb was directly the bus clock (clk81), and on recent socs
this clock is a gate.

The question is why is it disabled. Maybe a previous failed probe disabled it
in the dw-hdmi probe failure code and this clock is needed for meson_dw_hdmi_init(),
so yeah this is the right fix.

Thanks.

Could you send a revert of b33340e33acdfe5ca6a5aa1244709575ae1e0432 and then proper fix with clk_disable_unprepare added ?

>>
>> If you guys are going ahead with this fix, this call to
>> clk_prepare_enable() needs to be balanced with clk_disable_unprepare() somehow
> 
> Yup, good point.
> 
> Although this driver *never* disables any clock it enables, and leaves it
> to the main DW driver, which I guess makes it leak references.
> 
> So all 3 clocks need fixing.

Exact.

Thx Guillaume for testing,

Neil

> 
> Thanks,
> 
>         M.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ