lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGETcx-RWyNgdyZ77i_7wQ8U2bCYQ7i0M5xswK-3iJ8PPWPW5w@mail.gmail.com>
Date:   Thu, 11 Feb 2021 19:04:28 -0800
From:   Saravana Kannan <saravanak@...gle.com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     Jon Hunter <jonathanh@...dia.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Android Kernel Team <kernel-team@...roid.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Jisheng Zhang <Jisheng.Zhang@...aptics.com>,
        Kevin Hilman <khilman@...libre.com>,
        John Stultz <john.stultz@...aro.org>,
        Nicolas Saenz Julienne <nsaenzjulienne@...e.de>,
        Marc Zyngier <maz@...nel.org>,
        linux-tegra <linux-tegra@...r.kernel.org>
Subject: Re: [PATCH v1 0/5] Enable fw_devlink=on by default

On Thu, Feb 11, 2021 at 9:48 AM Rafael J. Wysocki <rafael@...nel.org> wrote:
>
> On Thu, Feb 11, 2021 at 6:15 PM Saravana Kannan <saravanak@...gle.com> wrote:
> >
> > On Thu, Feb 11, 2021 at 7:03 AM Rafael J. Wysocki <rafael@...nel.org> wrote:
> > >
> > > On Thu, Feb 11, 2021 at 1:02 AM Saravana Kannan <saravanak@...gle.com> wrote:
> > > >
> > > > On Thu, Jan 28, 2021 at 7:03 AM Jon Hunter <jonathanh@...dia.com> wrote:
> > > > >
> > > > >
> > > > > On 14/01/2021 16:56, Jon Hunter wrote:
> > > > > >
> > > > > > On 14/01/2021 16:47, Saravana Kannan wrote:
> > > > > >
> > > > > > ...
> > > > > >
> > > > > >>> Yes this is the warning shown here [0] and this is coming from
> > > > > >>> the 'Generic PHY stmmac-0:00' device.
> > > > > >>
> > > > > >> Can you print the supplier and consumer device when this warning is
> > > > > >> happening and let me know? That'd help too. I'm guessing the phy is
> > > > > >> the consumer.
> > > > > >
> > > > > >
> > > > > > Sorry I should have included that. I added a print to dump this on
> > > > > > another build but failed to include here.
> > > > > >
> > > > > > WARNING KERN Generic PHY stmmac-0:00: supplier 2200000.gpio (status 1)
> > > > > >
> > > > > > The status is the link->status and looks like the supplier is the
> > > > > > gpio controller. I have verified that the gpio controller is probed
> > > > > > before this successfully.
> > > > > >
> > > > > >> So the warning itself isn't a problem -- it's not breaking anything or
> > > > > >> leaking memory or anything like that. But the device link is jumping
> > > > > >> states in an incorrect manner. With enough context of this code (why
> > > > > >> the device_bind_driver() is being called directly instead of going
> > > > > >> through the normal probe path), it should be easy to fix (I'll just
> > > > > >> need to fix up the device link state).
> > > > > >
> > > > > > Correct, the board seems to boot fine, we just get this warning.
> > > > >
> > > > >
> > > > > Have you had chance to look at this further?
> > > >
> > > > Hi Jon,
> > > >
> > > > I finally got around to looking into this. Here's the email[1] that
> > > > describes why it's done this way.
> > > >
> > > > [1] - https://lore.kernel.org/lkml/YCRjmpKjK0pxKTCP@lunn.ch/
> > > >
> > > > >
> > > > > The following does appear to avoid the warning, but I am not sure if
> > > > > this is the correct thing to do ...
> > > > >
> > > > > index 9179825ff646..095aba84f7c2 100644
> > > > > --- a/drivers/base/dd.c
> > > > > +++ b/drivers/base/dd.c
> > > > > @@ -456,6 +456,10 @@ int device_bind_driver(struct device *dev)
> > > > >  {
> > > > >         int ret;
> > > > >
> > > > > +       ret = device_links_check_suppliers(dev);
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > > +
> > > > >         ret = driver_sysfs_add(dev);
> > > > >         if (!ret)
> > > > >                 driver_bound(dev);
> > > >
> > > > So digging deeper into the usage of device_bind_driver and looking at
> > > > [1], it doesn't look like returning an error here is a good option.
> > > > When device_bind_driver() is called, the driver's probe function isn't
> > > > even called. So, there's no way for the driver to even defer probing
> > > > based on any of the suppliers. So, we have a couple of options:
> > > >
> > > > 1. Delete all the links to suppliers that haven't bound.
> > >
> > > Or maybe convert them to stateless links?  Would that be doable at all?
> >
> > Yeah, I think it should be doable.
> >
> > >
> > > > We'll still leave the links to active suppliers alone in case it helps with
> > > > suspend/resume correctness.
> > > > 2. Fix the warning to not warn on suppliers that haven't probed if the
> > > > device's driver has no probe function. But this will also need fixing
> > > > up the cleanup part when device_release_driver() is called. Also, I'm
> > > > not sure if device_bind_driver() is ever called when the driver
> > > > actually has a probe() function.
> > > >
> > > > Rafael,
> > > >
> > > > Option 1 above is pretty straightforward.
> > >
> > > I would prefer this ->
> >
> > Ok
> >
> > >
> > > > Option 2 would look something like what's at the end of this email +
> > > > caveat about whether the probe check is sufficient.
> > >
> > > -> because "fix the warning" really means that we haven't got the
> > > device link state machine right and getting it right may imply a major
> > > redesign.
> > >
> > > Overall, I'd prefer to take a step back and allow things to stabilize
> > > for a while to let people catch up with this.
> >
> > Are you referring to if/when we implement Option 2? Or do you want to
> > step back for a while even before implementing Option 1?
>
> I would do option 1 and if then see what happens and maybe go back
> from there if need be until getting a reasonably stable situation
> (that is all of the systems that used to work before still work at
> least).

Ok, I'll implement Option 1 soon. Also, thinking more about it, I
don't like converting it into STATELESS links. It's easy to do, but it
doesn't feel right for the driver core to "create" a STATELESS link
and then "forget" about it. So, when a device is force bound, I'll
just delete the links where the suppliers haven't probed yet.

-Saravana

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ