Message-ID: <20230220162137.xjeowlc4qd3rtzc2@ripper>
Date:   Mon, 20 Feb 2023 08:21:37 -0800
From:   Bjorn Andersson <andersson@...nel.org>
To:     Stephen Boyd <sboyd@...nel.org>
Cc:     Abel Vesa <abel.vesa@...aro.org>, Andy Gross <agross@...nel.org>,
        Dmitry Baryshkov <dmitry.baryshkov@...aro.org>,
        Konrad Dybcio <konrad.dybcio@...aro.org>,
        Mike Turquette <mturquette@...libre.com>,
        linux-clk@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-arm-msm@...r.kernel.org, mka@...omium.org
Subject: Re: [PATCH v3 1/2] clk: Add generic sync_state callback for
 disabling unused clocks

On Fri, Feb 17, 2023 at 09:38:22PM -0800, Stephen Boyd wrote:
> Quoting Abel Vesa (2022-12-27 12:45:27)
> > There are unused clocks that need to remain untouched by clk_disable_unused,
> > and most likely could be disabled later on sync_state. So provide a generic
> > sync_state callback for the clock providers that register such clocks.
> > Then, use the same mechanism as clk_disable_unused from that generic
> > callback, but pass the device to make sure only the clocks belonging to
> > the current clock provider get disabled, if unused. Also, during the
> > default clk_disable_unused, if the driver that registered the clock has
> > the generic clk_sync_state_disable_unused callback set for sync_state,
> > skip disabling its clocks.
> 
> How does that avoid disabling clks randomly in the clk tree? I'm
> concerned about disabling an unused clk in the middle of the tree
> because it doesn't have a driver using sync state, while the clk is the
> parent of an unused clk that is backed by sync state.
> 
>    clk A -->  clk B 
> 
> clk A: No sync state
> clk B: sync state
> 
> clk B is left on by the bootloader. __clk_disable_unused(NULL) is called
> from late init. Imagine clk A is the root of the tree.
> 
> 	clk_disable_unused_subtree(clk_core A)
> 	  clk_disable_unused_subtree(clk_core B)
> 	    if (from_sync_state && core->dev != dev)
> 	      return;
> 	  ...
> 	  clk core A->ops->disable()
> 
> clk core B is off now?
> 

I will have to give this some more thought. But this is exactly what we
have today; consider A being any builtin clock driver and B being any
clock driver built as modules, with relationship to A.

clk_disable_unused() will take down A without waiting for B, possibly
locking up parts of the clock hardware of B; or turning off the clocks
to IP blocks that rely on those clocks (e.g. display).

So my thought on this is that I don't think this patch negatively alters
the situation. But as it isn't recursive, this remains a problem that
needs to be fixed.
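
To make the A --> B case concrete, here is a minimal user-space sketch of
it. This is not the kernel code: struct model_clk, its fields and all
model_* functions are hypothetical stand-ins, and the tree is reduced to a
single child per clock. It only assumes what is described above, namely
that the late-init pass skips clocks whose provider relies on sync_state
but still gates any other unused clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct clk_core; not the kernel type. */
struct model_clk {
	const char *name;
	struct model_clk *parent;
	struct model_clk *child;	/* one child is enough for A --> B */
	bool owner_uses_sync_state;	/* provider set the generic sync_state callback */
	bool gate_on;			/* state of this clock's own gate */
	unsigned int enable_count;	/* consumers currently holding an enable */
};

/* A clock only runs if its own gate and every ancestor gate are on. */
static bool model_clk_running(const struct model_clk *c)
{
	for (; c; c = c->parent)
		if (!c->gate_on)
			return false;
	return true;
}

/*
 * Model of the late-init pass: children first, then the clock itself.
 * Clocks backed by sync_state are skipped, other unused clocks are gated.
 */
static void model_disable_unused_late(struct model_clk *c)
{
	assert(c != NULL);

	if (c->child)
		model_disable_unused_late(c->child);

	if (c->owner_uses_sync_state)
		return;

	if (c->enable_count == 0)
		c->gate_on = false;
}

/* Builds A --> B as in the quoted example and reports whether B still runs. */
static bool model_scenario_b_survives(void)
{
	struct model_clk a = { .name = "A", .gate_on = true };
	struct model_clk b = { .name = "B", .parent = &a,
			       .owner_uses_sync_state = true, .gate_on = true };

	a.child = &b;
	model_disable_unused_late(&a);

	return model_clk_running(&b);
}
```

In this model B's own gate is never touched, yet
model_scenario_b_survives() reports that B no longer runs, because A was
gated underneath it.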

> Also sync_state seems broken right now. I saw mka mentioned that if you
> have a device node enabled in your DT but never enable a driver for it
> in the kernel we'll never get sync_state called. This is another
> problem, but it concerns me that sync_state would make the unused clk
> disabling happen at some random time or not at all.
> 

I don't think that sync_state is "broken".

There is no way to distinguish a driver not being built in, or a driver
being built as module but not yet loaded. The approach taken by
sync_state currently is optimistically speculative.

One alternative to this is found in the regulator framework, where we
have a 30 second timer triggering the late disable. The result of this
is that every time I end up in the ramdisk console because "root file
system can't be mounted", I have 25 seconds to figure out what the
problem is before the backlight goes out...

As such I do prefer the optimistic approach...

> Can the problem be approached more directly? If this is about fixing
> continuous splash screen, then I wonder why we can't list out the clks
> that we know are enabled by the bootloader in some new DT binding, e.g.:
> 
> 	clock-controller {
> 		#clock-cells = <1>;
> 		boot-handoff-clocks = <&consumer_device "clock cells for this clk provider">;
> 	};
> 

I was under the impression that we have ruled out this approach.

Presumably this list should not be a manually maintained list of display
clocks, and that means the bootloader would need to go in and build
this list of all enabled clocks. I don't think this is practical.

> Then mark those as "critical/don't turn off" all the way up the clk tree
> when the clk driver probes by essentially incrementing the
> prepare/enable count but not actually touching the hardware, and when
> the clks are acquired by clk_get() for that device that's using them
> from boot we make the first clk_prepare_enable() do nothing and not
> increment the count at all. We can probably stick some flag into the
> 'struct clk' for this when we create the handle in clk_get() so that the
> prepare and enable functions can special case and skip over.
> 

The benefit of sync_state is that it kicks in when the client drivers have
probed. As such, you can have e.g. the display driver clk_get(), then
probe defer on some other resource, and the clock state can remain
untouched.

> The sync_state hook operates on a driver level, which is too large when
> you consider that a single clk driver may register hundreds of clks that
> are not related. We want to target a solution at the clk level so that
> any damage from keeping on all the clks provided by the controller is
> limited to just the drivers that aren't probed and ready to handle their
> clks. If sync_state could be called whenever a clk consumer consumes a
> clk it may work? Technically we already have that by the clk_hw_provider
> function but there isn't enough information being passed there, like the
> getting device.
> 

The current solution already operates on all clocks of all drivers that
happen to be probed at late_initcall(). This patch removes the
subordinate clause from this, allowing clock drivers and their clients
to be built as modules.

So while it still operates on all clocks of a driver, it moves that
point to a later stage, where that is more reasonable to do.



It would probably be better (haven't considered all aspects) if sync_state
could prune the tree gradually, disabling the branches that are fully probed.

But it wouldn't affect Matthias' problem; e.g. if you forget to build the
venus driver, sync_state won't happen for that branch of the tree.
(Something that is arguably better than leaving all the clocks for that
driver enabled)
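
A rough user-space sketch of what such gradual pruning could look like,
purely as an illustration. Nothing here exists in the clk framework:
struct prune_clk, the consumers_probed flag and the prune_* functions are
all assumptions made up for this example. The idea is that a clock is
gated only once its consumers have probed, nobody holds an enable, and
every child below it has already been gated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PRUNE_MAX_CHILDREN 4

/* Hypothetical model node; not struct clk_core. */
struct prune_clk {
	struct prune_clk *children[PRUNE_MAX_CHILDREN];
	size_t nr_children;
	bool consumers_probed;		/* all consumers of this clock have probed */
	bool gate_on;
	unsigned int enable_count;
};

/*
 * Returns true if @c and everything below it is off after the call.
 * Every child is visited, so fully probed branches are gated even when
 * a sibling branch has to stay on.
 */
static bool prune_subtree(struct prune_clk *c)
{
	bool all_children_off = true;
	size_t i;

	assert(c != NULL);

	for (i = 0; i < c->nr_children; i++)
		if (!prune_subtree(c->children[i]))
			all_children_off = false;

	if (!all_children_off)
		return false;		/* something below still needs us */
	if (!c->gate_on)
		return true;		/* already off */
	if (!c->consumers_probed || c->enable_count > 0)
		return false;

	c->gate_on = false;
	return true;
}

/*
 * Root A with two branches: B (fully probed, unused) and V (consumer
 * driver never built, so it never reports as probed).
 * Returns a bitmask: bit 0 = A on, bit 1 = B on, bit 2 = V on.
 */
static int prune_demo(void)
{
	struct prune_clk b = { .consumers_probed = true, .gate_on = true };
	struct prune_clk v = { .consumers_probed = false, .gate_on = true };
	struct prune_clk a = { .children = { &b, &v }, .nr_children = 2,
			       .consumers_probed = true, .gate_on = true };

	prune_subtree(&a);

	return (a.gate_on << 0) | (b.gate_on << 1) | (v.gate_on << 2);
}
```

In this model the B branch is gated, while V and their common parent A
stay on, which matches the venus case: the unprobed branch keeps its
clocks, but it no longer holds the rest of the provider hostage.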

Regards,
Bjorn

> > diff --git a/include/linux/clk-provider.h b/include/linux/clk-provider.h
> > index 842e72a5348f..cf1adfeaf257 100644
> > --- a/include/linux/clk-provider.h
> > +++ b/include/linux/clk-provider.h
> > @@ -720,6 +720,7 @@ struct clk *clk_register_divider_table(struct device *dev, const char *name,
> >                 void __iomem *reg, u8 shift, u8 width,
> >                 u8 clk_divider_flags, const struct clk_div_table *table,
> >                 spinlock_t *lock);
> > +void clk_sync_state_disable_unused(struct device *dev);
> 
> This is a weird place to put this. Why not in the helper functions
> section?
> 
> >  /**
> >   * clk_register_divider - register a divider clock with the clock framework
> >   * @dev: device registering this clock
