Message-ID: <CAMRc=Mc=QFizXRDS7=_v6K9Z7c7dSBZdLbW9zCR01LFSk08MEg@mail.gmail.com>
Date:   Fri, 2 Mar 2018 18:39:41 +0100
From:   Bartosz Golaszewski <brgl@...ev.pl>
To:     David Lechner <david@...hnology.com>
Cc:     Bartosz Golaszewski <bgolaszewski@...libre.com>,
        linux-clk@...r.kernel.org,
        linux-devicetree <devicetree@...r.kernel.org>,
        arm-soc <linux-arm-kernel@...ts.infradead.org>,
        Michael Turquette <mturquette@...libre.com>,
        Stephen Boyd <sboyd@...eaurora.org>,
        Rob Herring <robh+dt@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Sekhar Nori <nsekhar@...com>,
        Kevin Hilman <khilman@...nel.org>,
        Adam Ford <aford173@...il.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v7 10/42] clk: davinci: New driver for davinci PSC clocks

2018-03-01 17:44 GMT+01:00 David Lechner <david@...hnology.com>:
> On 03/01/2018 02:36 AM, Bartosz Golaszewski wrote:
>>
>> 2018-02-28 22:40 GMT+01:00 David Lechner <david@...hnology.com>:
>>>
>>> On 02/28/2018 06:38 AM, Bartosz Golaszewski wrote:
>>>>
>>>>
>>>>
>>>> I think I found the reason for the strange crashes we were
>>>> experiencing (emac core->name being NULL) thanks to Sekhar who pointed
>>>> me in the right direction.
>>>>
>>>> The mdio driver fails to probe with v7 due to the supplied clock rate
>>>> being wrong. Before failing we register the emac clock with
>>>> pm_clk_add_clk(). When clock_ops puts the clock, it decreases the
>>>> reference count of the clock, but we never actually increased it in
>>>> the first place in the line above. The core clock code then destroys
>>>> the associated clk_core structure. When the next user comes around (in
>>>> our case the clk debug functions) the system crashes.
>>>>
>>>> I believe there to be two issues: one is with v7 - we need to increase
>>>> the clock reference count in davinci_psc_genpd_attach_dev().
>>>>
>>>> Second is the error path in the clock framework - we should remove the
>>>> destroyed clk_core from the debug list, which is not being done now.
>>>>
>>>> Why we even need to track the refcount of clk_core is a mystery to me
>>>> though. Stephen, Mike?
>>>>
>>>> Best regards,
>>>> Bartosz Golaszewski
>>>
>>>
>>>
>>> Great find. I figured it had to be something like this, but I wasn't
>>> able to reproduce the problem yet.
>>>
>>> I suppose it is time to spin up a v8 with some fixes.
>>
>>
>> I still don't know why the mdio clock rate is much lower than in
>> mainline though. Any ideas?
>>
>> Thanks,
>> Bart
>>
>
> Now that you have fixed the crash, can you answer the questions I have
> asked earlier?
>
>> Can you post the output of this command so that I can see how your
>> clocks are set up:
>>
>> cat /sys/kernel/debug/clk/clk_summary
>
>> Using your workaround, can you run:
>>
>> cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
>
> If you see:
>   1e27000.clock-controller: emac  off-0
>
> then genpd is not working like it is supposed to. You should see something
> like this for devices that are working:
>           1e27000.clock-controller: uart2  on
>     /devices/platform/soc@...0000/1d0d000.serial        active

I used of_clk_get() in the genpd attach callback so the crash no
longer happens, but I still can't boot it over NFS due to mdio
failing. Do you have any idea why the clock rate differs between v7
and mainline?
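
To illustrate the imbalance behind the crash, here is a toy userspace
model. It is only a sketch and not the kernel's code: toy_clk_core,
toy_clk_get(), toy_clk_put() and toy_attach_then_fail() are made-up
names, and I am assuming the provider holds exactly one reference on
the core.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for clk_core: only a reference count and a liveness flag. */
struct toy_clk_core {
	int refcount;
	bool destroyed;
};

/* Models a lookup that takes its own reference on the core. */
void toy_clk_get(struct toy_clk_core *core)
{
	core->refcount++;
}

/* Models a put: drop one reference, destroy the core on the last one. */
void toy_clk_put(struct toy_clk_core *core)
{
	assert(core->refcount > 0);
	if (--core->refcount == 0)
		core->destroyed = true;
}

/*
 * Models attach followed by a failed probe. The provider is assumed to
 * hold one reference. If attach does not take its own reference, the
 * put on the teardown path drops the provider's reference instead.
 * Returns true if the core is still alive afterwards.
 */
bool toy_attach_then_fail(bool attach_takes_ref)
{
	struct toy_clk_core core = { .refcount = 1, .destroyed = false };

	if (attach_takes_ref)
		toy_clk_get(&core);

	/* pm clock teardown after the failed probe */
	toy_clk_put(&core);

	return !core.destroyed;
}
```

With attach_takes_ref set to false the core ends up destroyed while the
provider still thinks it owns it, which is the v7 situation as I
understand it. With it set to true the core survives the failed probe.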

From the logs I can see that the genpd domains are correctly registered
and the provider is added (you should probably skip setting up the
domains in legacy mode though). The pm clocks are enabled (after being
disabled by mdio after its failed probe()), but the boot process gets
stuck after the kernel gets an IP address over DHCP. That is strange,
because getting an address means it apparently had some kind of
network connection.

On Monday I'll prepare a small ramfs and boot over tftp only and see from there.

Best regards,
Bartosz
