Date:   Mon, 3 Aug 2020 14:57:46 -0700
From:   Doug Anderson <dianders@...omium.org>
To:     John Stultz <john.stultz@...aro.org>
Cc:     Linus Walleij <linus.walleij@...aro.org>,
        Rajendra Nayak <rnayak@...eaurora.org>,
        Maulik Shah <mkshah@...eaurora.org>,
        Marc Zyngier <maz@...nel.org>,
        Lina Iyer <ilina@...eaurora.org>,
        Cheng-Yi Chiang <cychiang@...omium.org>,
        Stephen Boyd <swboyd@...omium.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Andy Gross <agross@...nel.org>,
        linux-arm-msm <linux-arm-msm@...r.kernel.org>,
        "open list:GPIO SUBSYSTEM" <linux-gpio@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Todd Kjos <tkjos@...gle.com>,
        Amit Pundir <amit.pundir@...aro.org>
Subject: Re: [PATCH v3] pinctrl: qcom: Handle broken/missing PDC dual edge
 IRQs on sc7180

Hi,

On Mon, Aug 3, 2020 at 2:06 PM John Stultz <john.stultz@...aro.org> wrote:
>
> On Tue, Jul 14, 2020 at 8:08 AM Douglas Anderson <dianders@...omium.org> wrote:
> >
> > Depending on how you look at it, you can either say that:
> > a) There is a PDC hardware issue (with the specific IP rev that exists
> >    on sc7180) that causes the PDC not to work properly when configured
> >    to handle dual edges.
> > b) The dual edge feature of the PDC hardware was only added in later
> >    HW revisions and thus isn't in all hardware.
> >
> > Regardless of how you look at it, let's work around the lack of dual
> > edge support by only ever letting our parent see requests for single
> > edge interrupts on affected hardware.
> >
> > NOTE: it's possible that a driver requesting a dual edge interrupt
> > might get several edges coalesced into a single IRQ.  For instance if
> > a line starts low and then goes high and low again, the driver that
> > requested the IRQ is not guaranteed to be called twice.  However, it
> > is guaranteed that once the driver's interrupt handler starts running
> > its first instruction that any new edges coming in will cause the
> > interrupt to fire again.  This is relatively commonplace for dual-edge
> > gpio interrupts (many gpio controllers require software to emulate
> > dual edge with single edge) so client drivers should be setup to
> > handle it.
> >
> > Fixes: e35a6ae0eb3a ("pinctrl/msm: Setup GPIO chip in hierarchy")
> > Signed-off-by: Douglas Anderson <dianders@...omium.org>
>
> Just as a heads up. I started seeing boot failures (crashes really
> early before we get serial output) with db845c when testing with the
> android-mainline tree that pulled v5.8 in.

Even before earlycon?  Ick.  For me earlycon comes up way before
pinctrl and I thought that, by design, earlycon came up so dang early
that you could debug almost anything with it.

To confirm, I could even drop into kgdb via "kgdboc_earlycon" (which
starts later than earlycon), then set a breakpoint on
msm_pinctrl_probe() and I'd hit my breakpoint.  Enabling earlycon
should be super easy these days--just add the "earlycon" command line
parameter and the kernel seems to do the rest of the magic based on
the "stdout-path".  I guess if your bootloader doesn't cooperate and
leave the system in an OK state then you'll be in bad shape, but
otherwise it should be nice...

NOTE: if you have earlycon and this is still causing crashes before
earlycon starts, the only things I can think of are side effects of
this patch.  Could it have made your kernel just a little too big and
now you're overflowing some hard limit of the bootloader?  Maybe
you're hitting a ccache bug and using some stale garbage (don't laugh,
this happened to me the other year)?  Maybe there's a pointer bug and
this moves addresses just enough to make it cause havoc?


> I did some quick bisection and came down to this patch, and sure
> enough things boot again with this patch reverted.
>
> In my testing earlier today with v5.8 (+ just a few patches for db845c
> support), I didn't see this failure, but the configs in use are
> different there.
>
> I'll try to spend a bit of time to understand exactly what is failing,
> but if you have any initial suggestions for things to try, I'd
> appreciate it.

So on SDM845 we aren't setting "wakeirq_dual_edge_errata", right?
It's possible that you also need it, but I didn't have an SDM845
device in front of me to test with--I only have remote access to one.
...but in any case, the fact that SDM845 doesn't have
"wakeirq_dual_edge_errata" set should eliminate a bunch of code.

Once you eliminate that there's almost nothing left of this patch.
You could try commenting out:

irq_set_handler_locked(d, handle_fasteoi_irq);

...and see if that helps?

NOTE: I just tried putting kernel 5.8 on my sdm845-cheza device.  It
booted up without crashing...  I'm probably not using the same config
you are, but at least it appears that sdm845 isn't totally broken or
anything...
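
For anyone following along, the coalescing behavior described in the
patch description (dual edge emulated on top of a single-edge parent)
can be modeled in plain userspace C.  This is only a sketch to show the
idea, not the pinctrl-msm code: "struct sim_line" and the sim_*()
helpers are hypothetical names made up for the illustration, and the
model is single-threaded so it ignores the race between re-arming the
polarity and the line changing again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a line whose parent can latch only ONE edge polarity. */
struct sim_line {
	bool level;       /* current electrical level of the line */
	bool want_rising; /* polarity the single-edge parent is armed for */
	bool pending;     /* an interrupt has been latched */
	unsigned handled; /* number of times the client handler ran */
};

static void sim_init(struct sim_line *l, bool level)
{
	assert(l != NULL);
	l->level = level;
	l->want_rising = !level; /* arm for the edge that leaves this level */
	l->pending = false;
	l->handled = 0;
}

/* "Hardware" side: latch only an edge matching the armed polarity. */
static void sim_set_level(struct sim_line *l, bool level)
{
	bool rising = !l->level && level;
	bool falling = l->level && !level;

	l->level = level;
	if ((rising && l->want_rising) || (falling && !l->want_rising))
		l->pending = true;
}

/*
 * "Software" side: ack the latched interrupt, re-arm for the edge that
 * leaves the CURRENT level, then count one handler invocation.
 * Returns true if a handler invocation happened.
 */
static bool sim_service(struct sim_line *l)
{
	if (!l->pending)
		return false;

	l->pending = false;
	l->want_rising = !l->level;
	l->handled++;
	return true;
}
```

In this model, a line that starts low, goes high, and goes low again
before sim_service() runs produces a single handler call (the falling
edge is never latched because the parent was armed for rising).  Any
edge arriving after sim_service() has re-armed the polarity is latched
again, which matches the guarantee the patch description gives to
client drivers.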

-Doug
