Message-ID: <42f62311-b541-4b0f-8b90-ca1a5dfe1e6c@notapiano>
Date: Fri, 30 Aug 2024 09:55:41 -0400
From: Nícolas F. R. A. Prado <nfraprado@...labora.com>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: "Rafael J. Wysocki" <rjw@...ysocki.net>,
Linux PM <linux-pm@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Lukasz Luba <lukasz.luba@....com>, Zhang Rui <rui.zhang@...el.com>,
regressions@...ts.linux.dev, kernelci@...ts.linux.dev,
kernel@...labora.com
Subject: Re: [PATCH v3 00/14] thermal: Rework binding cooling devices to trip points
On Mon, Aug 26, 2024 at 11:58:12AM +0200, Rafael J. Wysocki wrote:
> On Sat, Aug 24, 2024 at 8:45 PM Nícolas F. R. A. Prado
> <nfraprado@...labora.com> wrote:
> >
> > On Mon, Aug 19, 2024 at 05:49:07PM +0200, Rafael J. Wysocki wrote:
> > > Hi Everyone,
> > >
> > > This is one more update of
> > >
> > > https://lore.kernel.org/linux-pm/3134863.CbtlEUcBR6@rjwysocki.net/#r
> > >
> > > the cover letter of which was sent separately by mistake:
> > >
> > > https://lore.kernel.org/linux-pm/CAJZ5v0jo5vh2uD5t4GqBnN0qukMBG_ty33PB=NiEqigqxzBcsw@mail.gmail.com/
> > >
> > > and it has been updated once already:
> > >
> > > https://lore.kernel.org/linux-pm/114901234.nniJfEyVGO@rjwysocki.net/
> > >
> > > > Relative to the v2 above it drops 3 patches, one because it was broken ([04/17]
> > > in the v2), and two more that would need to be rebased significantly, either
> > > because of dropping the other broken patch or because of the recent Bang-bang
> > > governor fixes:
> > >
> > > https://lore.kernel.org/linux-pm/1903691.tdWV9SEqCh@rjwysocki.net/
> > >
> > > > The remaining 14 patches, 2 of which have been slightly rebased and the rest
> > > > of which are mostly unchanged (except for some very minor subject and
> > > > changelog fixes), are not expected to be controversial and are targeting 6.12,
> > > > on top of the current linux-next material.
> > >
> > > The original motivation for this series quoted below has not changed:
> > >
> > > > The code for binding cooling devices to trip points (and unbinding them from
> > > > trip points) is one of the murkiest pieces of the thermal subsystem. It is
> > > convoluted, bloated with unnecessary code doing questionable things, and it
> > > works backwards.
> > >
> > > The idea is to bind cooling devices to trip points in accordance with some
> > > information known to the thermal zone owner (thermal driver). This information
> > > is not known to the thermal core when the thermal zone is registered, so the
> > > driver needs to be involved, but instead of just asking the driver whether
> > > or not the given cooling device should be bound to a given trip point, the
> > > thermal core expects the driver to carry out all of the binding process
> > > including calling functions specifically provided by the core for this
> > > purpose which is cumbersome and counter-intuitive.
> > >
> > > Because the driver has no information regarding the representation of the trip
> > > points at the core level, it is forced to walk them (and it has to avoid some
> > > locking traps while doing this), or it needs to make questionable assumptions
> > > regarding the ordering of the trips in the core. There are drivers doing both
> > > these things.
> > >
> > > The first 5 patches in the series are preliminary.
> > >
> > > Patch [06/14] introduces a new .should_bind() callback for thermal zones and
> > > > patches [07,09-12/14] modify drivers to use it instead of the .bind() and
> > > > .unbind() callbacks, which allows them to be simplified quite a bit.
> > >
> > > The other patches [08,13-14/14] get rid of code that becomes unused after the
> > > previous changes and do some cleanups on top of that.
> > >
> > > The entire series along with 2 patches on top of it (that were present in the
> > > v2 of this set of patches) is available in the thermal-core-testing git branch:
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/log/?h=thermal-core-testing
> > >
> > > (note that this branch is going to be rebased shortly on top of 6.11-rc4
> > > and the thermal control material in linux-next).
> > >
> > > Thanks!
> >
> > Hi,
> >
> > KernelCI has identified a boot regression originating from this series. I've
> > verified that reverting the series fixes the issue.
>
> Thanks for the report!
>
> There was a bug in the original patch [12/14] that would cause
> symptoms like what you are observing to appear, which was reported on
> Friday and has since been fixed in the tree. Please see:
>
> https://lore.kernel.org/linux-pm/CAJZ5v0iw7uXE_cfU5VXOjFDg9GM8Hu0+hKxqfzU3v0OM5KK9oQ@mail.gmail.com/
>
> You probably have not tested the fixed tree yet, so please let
> kernelci run again on it and if the issue is still there, please let
> me know.

Indeed it has been fixed.

#regzbot fix: 'thermal/of: Use the .should_bind() thermal zone callback'

Thanks,
Nícolas