lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <db550aa0-ed92-8e25-10bf-06a78d771048@codeaurora.org>
Date:   Tue, 30 Apr 2019 14:52:40 -0600
From:   Jeffrey Hugo <jhugo@...eaurora.org>
To:     Stephen Boyd <sboyd@...nel.org>, mturquette@...libre.com
Cc:     bjorn.andersson@...aro.org, georgi.djakov@...aro.org,
        linux-arm-msm@...r.kernel.org, linux-clk@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1] clk: Probe defer clk_get() on orphans

On 4/22/2019 6:52 PM, Stephen Boyd wrote:
> Quoting Jeffrey Hugo (2019-02-11 10:57:47)
>> If a parent to a clock comes from outside that clock's provider, the parent
>> may not be present at the time the clock is registered (ie the parent comes
>> from another driver that has not yet probed).  The clock can still be
>> registered, and a reference to it obtained, however that clock may not be
>> fully functional - ie get_rate might return an invalid value.
>>
>> This has been a problem that has resulted in the UART console breaking on
>> some Qualcomm SoCs, as the UART baud rate is based on a clock that is the
>> child of XO.  Due to the large chain of dependencies, its possible that the
>> RPM has not provided XO by the time that the UART driver probes, gets the
>> baud rate clock, and calls get_rate - which returns 0 and results in a bad
>> configuration.
>>
>> An orphan clock is a clock that is missing a parent or some other ancestor.
>> Since the parent is defined, we can assume that it is expected to appear at
>> some point in a properly configured system (all bets are off if a required
>> driver is not compiled, etc), and it is unlikely that the clock can be
>> properly consumed during the time the clock is an orphan.  Therefore,
>> return EPROBE_DEFER for orphan clocks so that consumers wait until the
>> parent chain is established, and proper clock operation can occur.
>>
>> Signed-off-by: Jeffrey Hugo <jhugo@...eaurora.org>
>> ---
>>
>> This is based upon the "Rewrite clk parent handling" series at [1], and assumes
>> that the suspected missing line commented on at [2] is added.
>>
>> The idea for this solution came from [3] and [4].
>>
>> [1] https://lore.kernel.org/lkml/20190129061021.94775-1-sboyd@kernel.org/T/#u
>> [2] https://lkml.org/lkml/2019/2/11/1634
>> [3] https://lkml.org/lkml/2019/2/6/382
>> [4] https://lkml.org/lkml/2015/12/27/209
> 
> There have been multiple attempts over the years to support probe defer
> for clks that don't have parents. If you search the kernel mailing list
> archives I'm sure you'll come across them (for example
> https://patchwork.kernel.org/patch/6313051/). That's why we have the
> first part of the code to indicate if a clk is an orphan or not, i.e.
> commit e6500344edbb ("clk: track the orphan status of clocks and their
> children"), but not this patch that you've sent.
> 
> There are a couple requirements that we need to make sure we don't break
> first.
> 
>   1. clk_get() should work for clks on the orphan list if that clk is
>   parented to something that will never be registered with the framework
> 
>   2. We need a way for drivers to express that the parent of a clk
>   won't exist
> 
>   3. Critical clks need to turn clks on even if they'll never get
>   parents registered
> 
> We've had problems in the past
> (http://lists.infradead.org/pipermail/linux-arm-kernel/2015-May/343007.html)
> where bootloaders configure clks in certain ways that the kernel doesn't
> care to even consider as possible. In these cases we either need to let
> clk_set_rate() reparent them when consumers are ready or we need to
> convert drivers that are forcing on clks early to use the
> CLK_IS_CRITICAL flag to turn on clks even if they're never going to find
> their parents.
> 
> Last time we tried to do this in 2015 (wow so many years ago!) we were
> blocked on not having the critical clks infrastructure and the legacy
> sunxi clk driver needed to convert to DT and critical clks flag to keep
> working. I think we could have done it a year or two ago, because sunxi
> moved to a new design, but then we got more use-cases where clks may
> never get the parent they're currently configured for in the bootloader
> and then the kernel would never hand out the clk to consumers and the
> clk_set_rate() case would fail.
> 
> To fix that last part, I'm proposing we introduce the .get_parent_hw()
> op and then rely on drivers to tell the framework that the parent is
> there either with a direct pointer reference or by knowing that the
> DT/firmware is telling us the parent is valid. If we just rely on string
> names and a u8 to indicate parents then we don't have enough information
> to figure out how the parent is provided and if it will ever appear at
> some point in the future. Once we have a way to describe this through
> DT/firmware then we're able to indicate the clk is an orphan when that's
> actually the case vs. when the clk is configured in hardware for
> something that we won't know about. You can see this work in the
> clk-parent-rewrite series in clk.git.
> 
> There's also one more problem, which is what we do with clks that we've
> handed out to consumers and then the driver for the parent of that clk
> is removed and the parent is unregistered. Right now, we move these clks
> to the orphan list and set the clk_nodrv_ops on the parent that's
> unregistered. We probably need to set clk_nodrv_ops on all the children
> that get orphaned, and remove the cached clk_core pointer in all the
> clk_core::parents members (even ones that aren't currently using it!),
> and stash away the original clk_ops so we can restore them later when
> the clk is properly reparented if the parent comes back. It's a lot of
> book keeping to remove the dangling references and let it come back
> later. I haven't started on this part, but it's on my radar.
> 

Ugh.  Ok.  Much more work to be done.  I guess we live with the fake XO 
in GCC hack for a while yet then.

-- 
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ