lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAGETcx8DNDQZ_ikjJD9pybt9-jMeb399am1k3xHOjuTFn1imTg@mail.gmail.com>
Date:   Thu, 6 Aug 2020 23:22:50 -0700
From:   Saravana Kannan <saravanak@...gle.com>
To:     Bjorn Andersson <bjorn.andersson@...aro.org>
Cc:     John Stultz <john.stultz@...aro.org>,
        Steev Klimaszewski <steev@...i.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Jason Cooper <jason@...edaemon.net>,
        Marc Zyngier <maz@...nel.org>,
        Matthias Brugger <matthias.bgg@...il.com>,
        Andy Gross <agross@...nel.org>,
        Android Kernel Team <kernel-team@...roid.com>,
        lkml <linux-kernel@...r.kernel.org>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        "moderated list:ARM/Mediatek SoC support" 
        <linux-mediatek@...ts.infradead.org>,
        linux-arm-msm <linux-arm-msm@...r.kernel.org>,
        Hanks Chen <hanks.chen@...iatek.com>,
        CC Hwang <cc.hwang@...iatek.com>,
        Loda Chou <loda.chou@...iatek.com>,
        Thierry Reding <thierry.reding@...il.com>,
        Nicolas Saenz Julienne <nsaenzjulienne@...e.de>
Subject: Re: [PATCH v3 2/4] irqchip/qcom-pdc: Switch to using
 IRQCHIP_PLATFORM_DRIVER helper macros

On Thu, Aug 6, 2020 at 10:58 PM Bjorn Andersson
<bjorn.andersson@...aro.org> wrote:
>
> On Thu 06 Aug 19:48 PDT 2020, John Stultz wrote:
>
> > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson
> > <bjorn.andersson@...aro.org> wrote:
> > > On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote:
> > > > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson
> > > > <bjorn.andersson@...aro.org> wrote:
> > > > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote:
> > > > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski <steev@...i.org> wrote:
> > > > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote:
> > > > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote:
> > > > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@...aro.org> wrote:
> > > > > > > >>> <sigh>
> > > > > > > >>> So this is where I bashfully admit I didn't get a chance to try this
> > > > > > > >>> patch series out, as I had success with a much older version of
> > > > > > > >>> Saravana's macro magic.
> > > > > > > >>>
> > > > > > > >>> But unfortunately, now that this has landed in mainline, I'm seeing
> > > > > > > >>> boot regressions on db845c. :( This is in the non-modular case,
> > > > > > > >>> building the driver in.
> > > > > > > >> Does that mean the modular version is working? Or you haven't tried
> > > > > > > >> that yet? I'll wait for your reply before I try to fix it. I don't
> > > > > > > >> have the hardware, but it should be easy to guess this issue looking
> > > > > > > >> at the code delta.
> > > > > > > > For what it's worth, I saw this too on the Lenovo C630 (started on -next
> > > > > > > > around 20200727, but I didn't track it down as, well, there's less way
> > > > > > > > to get debug output on the C630.
> > > > > > > >
> > > > > > > > In my testing, module or built-in doesn't matter, but reverting does
> > > > > > > > allow me to boot again.
> > > > > > > >
> > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit reverted
> > > > > > > boots, however, module (on the c630 at least) doesn't boot whether it's
> > > > > > > a module or built-in.
> > > > > >
> > > > > > You may need to set deferred_probe_timeout=30 to give things a bit
> > > > > > more grace time to load.
> > > > >
> > > > > With the risk of me reading more into this than what you're saying,
> > > > > please don't upstream anything that depend this parameter to be
> > > > > increased.
> > > > >
> > > > > Compiling any of these drivers as module should not require the user to
> > > > > pass additional kernel command line parameters in order to get their
> > > > > device to boot.
> > > >
> > > > So, ideally I agree, and Saravana's fw_devlink work should allow us to
> > > > avoid it. But the reality is that it is already required (at least in
> > > > configurations heavily using modules) to give more time for modules
> > > > loaded to resolve missing dependencies after init begins (due to
> > > > changes in the driver core to fail loading after init so that optional
> > > > dt links aren't eternally looked for). This was seen when trying to
> > > > enable the qualcom clk drivers to modules.
> > > >
> > >
> > > So to clarify what you're saying, any system that boots successfully
> > > with the default options is a sign of pure luck - regardless of being
> > > builtin or modules.
> > >
> > >
> > > And there you have my exact argument against the deferred timeout magic
> > > going on in the driver core. But as you know people insist that it's
> > > more important to be able to boot some defunct system from NFS than a
> > > properly configured one reliably.
> >
> > I'd agree, but the NFS case was in use before, and when the original
> > deferred timeout/optional link handling stuff landed no one complained
> > they were broken by it (at least at the point where it landed).
>
> I did object when this was proposed and I've objected for the last two
> years, because we keep adding more and more subsystems to follow this
> broken behavior.
>
> > Only later when we started enabling more lower-level core drivers as
> > modules did the shortened dependency resolution time start to bite
> > folks.  My attempt to set the default to be 30 seconds helped there,
> > but caused trouble and delays for the NFS case, and "don't break
> > existing users" seemed to rule, so I set the default timeout back to
> > 0.
> >
>
> I can't argue with that and I'm at loss on how to turn this around.
>
> > > > It doesn't seem necessary in this case, but I suggested it here as
> > > > I've got it enabled by default in my AOSP builds so that the
> > > > module-heavy configs for GKI boot properly (even if Saravana's
> > > > fw_devlink work is disabled).
> > > >
> > >
> > > With all due respect, that's your downstream kernel, the upstream kernel
> > > should not rely on luck, out-of-tree patches or kernel parameters.
> >
> > I agree that would be preferred. But kernel parameters are often there
> > for these sorts of cases where we can't always do the right thing.
> > As for out-of-tree patches, broken things don't get fixed until
> > out-of-tree patches are developed and upstreamed, and I know Saravana
> > is doing exactly that, and I hope his fw_devlink work helps fix it so
> > the module loading is not just a matter of luck.
> >
>
> I don't agree with this, upstream should be functional in its default
> configuration. Out-of-tree patches might be necessary to enable features
> or get the most out of the hardware, but what we have upstream should
> work.  And no, this is not always the case, but we should at least aim
> for this.
>
> > Also I think Thierry's comments in the other thread today are also
> > good ideas for ways to better handle the optional dt link handling
> > (rather than using a timeout).
> >
>
> I'll take a look at that, but to repeat what I've said many times
> before, for Qualcomm platforms there's pretty much no such thing as
> optional links.

Nicolas had suggested earlier that we could have a way to enable
fw_devlink=on in DT under the "chosen" node. I liked that idea, but
wasn't sure how easy it would be to convince DT maintainers to allow
it. If that happens, we could just set that and not have to worry
about these timeouts for QC platforms.

Not sure how easy it is to update DT in a DB 845c though. But for
cases where DT can't be updated, we'd still have the crappy timeout to
plaster over the issue.

Something like this or whatever sensible name that suggests that the
dependencies in DT are reliable enough to use for probe ordering,
deferred probe, etc. Then once that's there, we can turn on
fw_devlink=on and that'd completely skip this timeout code path (and
has a bunch of other benefits).

chosen {
        linux,reliable-dt-dependencies;
}

Just some food for thought.

Also, Thierry's idea would work, but it'll need changes to multiple
drivers. Also, I don't think it'll work for drivers that'll need to
ignore vs not ignore dependencies based on what board they are used
in. It's still better than the blind timeout though.

-Saravana

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ