Message-ID: <CAGETcx82Y8PBDJ2V5JbRGfzz96gZ3tS9hRP-774dQd-+k4s2MA@mail.gmail.com>
Date:   Tue, 27 Apr 2021 14:05:20 -0700
From:   Saravana Kannan <saravanak@...gle.com>
To:     Florian Fainelli <f.fainelli@...il.com>
Cc:     Sudeep Holla <sudeep.holla@....com>,
        Cristian Marussi <cristian.marussi@....com>,
        Jim Quinlan <james.quinlan@...adcom.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Al Cooper <alcooperx@...il.com>,
        Michael Walle <michael@...le.cc>,
        Jon Hunter <jonathanh@...dia.com>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Guenter Roeck <linux@...ck-us.net>,
        Android Kernel Team <kernel-team@...roid.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1 3/3] Revert "Revert "driver core: Set fw_devlink=on by default""

On Tue, Apr 27, 2021 at 9:47 AM Florian Fainelli <f.fainelli@...il.com> wrote:
>
>
>
> On 4/27/2021 9:24 AM, Saravana Kannan wrote:
> > On Tue, Apr 27, 2021 at 8:10 AM Sudeep Holla <sudeep.holla@....com> wrote:
> >>
> >> On Tue, Apr 27, 2021 at 03:11:16PM +0100, Cristian Marussi wrote:
> >>> On Tue, Apr 27, 2021 at 09:33:31AM -0400, Jim Quinlan wrote:
> >> [...]
> >>>>>
> >>>> I believe that the brcmstb-mbox node is in our DT for backwards
> >>>> compatibility with our older Linux only.   Note that  we use the compatible
> >>>> string '"arm,scmi-smc", "arm,scmi"'; the former chooses SMC transport and
> >>>> ignores custom  mailboxes such as brcmstb-mbox.
> >>>>
> >>>
> >>> Right..so it is even more wrong that it is waiting for the mailboxes...but
> >>> looking at the DT:
> >>>
> >>> brcm_scmi_mailbox@0 {
> >>>                 #mbox-cells = <0x01>;
> >>>                 compatible = "brcm,brcmstb-mbox";
> >>>                 status = "disabled";
> >>>                 linux,phandle = <0x04>;
> >>>                 phandle = <0x04>;
> >>>         };
> >>>
> >>> brcm_scmi@0 {
> >>>                 compatible = "arm,scmi-smc\0arm,scmi";
> >>>                 mboxes = <0x04 0x00 0x04 0x01>;
> >>>                 mbox-names = "tx\0rx";
> >>>                 shmem = <0x05>;
> >>>                 status = "disabled";
> >>>                 arm,smc-id = <0x83000400>;
> >>>                 interrupt-names = "a2p";
> >>>                 #address-cells = <0x01>;
> >>>                 #size-cells = <0x00>;
> >>>
> >>> it seems to me that even though you declare an SMC based transport (and in fact
> >>> you define the smc-id too) you also still define mboxes (as a fallback I suppose)
> >>> referring to the brcm_scmi_mailbox phandle, and while this is ignored by the SCMI
> >>> driver (because you have selected a compatible SMC transport) I imagine this dep
> >>> is picked up by fw_devlink which in fact says:
> >>>
> >>>> [    0.300086] platform brcm_scmi@0: Linked as a consumer to brcm_scmi_mailbox@0
> >>>
> >>> and stalls waiting for it. (but I'm not really familiar on how fw_devlink
> >>> internals works really...so I maybe off in these regards)
> >
> > Cristian,
> >
> > Great debugging work for not having worked on this before! Your
> > comments about the dependencies are right. If you grep the logs for
> > "probe deferral", you'll see these lines and more:
> >
> > [    0.942998] platform brcm_scmi@0: probe deferral - supplier
> > brcm_scmi_mailbox@0 not ready
> > [    3.622741] platform 8b20000.pcie: probe deferral - supplier
> > brcm_scmi@0 not ready
> > [    5.695929] platform 840c000.serial: probe deferral - supplier
> > brcm_scmi@0 not ready
> >
> > Florian,
> >
> > Sorry I wasn't clear in my earlier email. I was asking for the path to
> > the board file DT in upstream so I could look at it and the files it
> > references. I didn't mean to ask for a "decompiled" DTS attachment.
> > The decompiled ones make it a pain to track the phandles.
>
> Our Device Tree sources are not in the kernel since the bootloader
> provides a FDT to the kernel which is massaged in different ways to
> support a single binary for a multitude of reference boards and chip
> variants. That FDT is 90% auto-generated offline from scripts and about
> 10% runtime patched based on our whim. We should probably still aim for
> some visibility into these Device Tree files by the kernel community.
>
> >
> > The part that's confusing to me is that the mbox node is disabled in
> > the DT you attached. fw_devlink is smart enough to ignore disabled
> > nodes. Is it getting enabled by the bootloader? Can you please try
> > deleting the reference to the brcm_scmi_mailbox from the scmi node and
> > see if it helps? Or leave it really disabled?
>
> Removing the 'mboxes' phandle works, see my other reply to Sudeep and I
> should have captured the DT from the Linux prompt after it has been
> finalized and where the mbox node is marked as enabled unfortunately.
>
> >
> > Also, as a separate test of workarounds, can you please add
> > deferred_probe_timeout=1 to your commandline and see if it helps? I'm
> > assuming you have modules enabled? Otherwise, the existing smarts in
> > fw_devlink to ignore devices with no drivers would have kicked in too.
>
> deferred_probe_timeout=1 does help however all of these drivers are
> built into the kernel at the moment and so ultimately we reach
> user-space with no console driver registered.

Whether the required drivers are already built in doesn't matter for
this workaround. With CONFIG_MODULES enabled, fw_devlink can't tell
whether you are about to load a module that will probe the mailbox. If
CONFIG_MODULES is disabled, it knows no more drivers will be loaded by
the time you hit late_initcall_sync(), and it would have applied this
workaround automatically without deferred_probe_timeout=1.
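
For illustration only: if the bootloader passes the command line through
the FDT, the parameter would end up in the standard /chosen bootargs
property, roughly as sketched below. This assumes bootargs is where your
command line comes from; the string shown holds only the new parameter,
and whatever arguments you already pass would stay in front of it.

```dts
/* Sketch, not taken from the actual board DT. */
/ {
	chosen {
		/* Existing arguments are omitted here; append, don't replace. */
		bootargs = "deferred_probe_timeout=1";
	};
};
```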

>
> >
> >> I was about to mention/ask the same when I saw Jim's reply. I see you have
> >> already asked that. Couple of my opinions based on my very limited knowledge
> >> on fw_devlink and how it works.
> >>
> >> 1. Since we have different compatible for SMC and mailbox, I am not sure
> >>    if it correct to leave mailbox information in scmi node. Once we have
> >>    proper yaml scheme, we must flag that error IMO.
> >>
> >> 2. IIUC, the fw_devlink might use information from DT to establish the
> >>    dependency and having mailbox information in this context may be
> >>    considered wrong as there is no dependency if it is using SMC.
> >
> > If this mbox reference from scmi is wrong for the current kernel and
> > never used, then I'd recommend deleting that.
>
> Yes that seems to be the way forward unless we want to set
> fw_devlink=permissive on the command line, either should hopefully be an
> option.

I read all the other emails from Sudeep, Geert and you. I'll just
respond to all of them here.

My preferred order of the workarounds:
1. Fix the DT sent to the kernel.
2. If deferred_probe_timeout=1 doesn't break anything else, use that.
This is better than (4).
3. Geert's early boot quirk suggestion.
4. fw_devlink=permissive (least preferred because this might mask
issues with fw_devlink=on in your future changes).

Changing the SCMI driver itself won't help fw_devlink.
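
As a rough sketch of option 1, based only on the decompiled node quoted
above, the scmi node would look something like this once the mailbox
reference is gone. Dropping mbox-names along with mboxes is my
assumption (it has nothing to name once mboxes is removed); everything
else is copied from the quoted DT, including the parts that were cut
off there.

```dts
/* Sketch based on the decompiled DT quoted earlier in this thread. */
brcm_scmi@0 {
	/* SMC transport is selected by the first compatible string. */
	compatible = "arm,scmi-smc", "arm,scmi";

	/*
	 * mboxes = <0x04 0x00 0x04 0x01> and mbox-names = "tx", "rx"
	 * removed: with no phandle to brcm_scmi_mailbox@0, fw_devlink
	 * has no supplier link to create for this node.
	 */

	shmem = <0x05>;
	status = "disabled";	/* as quoted; the final DT may differ */
	arm,smc-id = <0x83000400>;
	interrupt-names = "a2p";
	#address-cells = <0x01>;
	#size-cells = <0x00>;

	/* Remaining properties and child nodes were not quoted; keep as is. */
};
```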

Thanks,
Saravana
