linux-kernel - Re: [PATCH] spi: spi-geni-qcom: Fix NULL pointer access in geni_spi

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <160773672053.1580929.15441111796129112926@swboyd.mtv.corp.google.com>
Date:   Fri, 11 Dec 2020 17:32:00 -0800
From:   Stephen Boyd <swboyd@...omium.org>
To:     Doug Anderson <dianders@...omium.org>
Cc:     Roja Rani Yarubandi <rojay@...eaurora.org>,
        Mark Brown <broonie@...nel.org>,
        Andy Gross <agross@...nel.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        linux-arm-msm <linux-arm-msm@...r.kernel.org>,
        linux-spi <linux-spi@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Akash Asthana <akashast@...eaurora.org>,
        msavaliy@....qualcomm.com
Subject: Re: [PATCH] spi: spi-geni-qcom: Fix NULL pointer access in geni_spi_isr

Quoting Doug Anderson (2020-12-10 17:51:53)
> Hi,
> 
> On Thu, Dec 10, 2020 at 5:39 PM Stephen Boyd <swboyd@...omium.org> wrote:
> >
> > Quoting Doug Anderson (2020-12-10 17:30:17)
> > > On Thu, Dec 10, 2020 at 5:21 PM Stephen Boyd <swboyd@...omium.org> wrote:
> > > >
> > > > Yeah and so if it comes way later because it timed out then what's the
> > > > point of calling synchronize_irq() again? To make the completion
> > > > variable set when it won't be tested again until it is reinitialized?
> > >
> > > Presumably the idea is to try to recover to a somewhat usable state
> > > again?  We're not rebooting the machine so, even though this transfer
> > > failed, we will undoubtedly do another transfer later.  If that
> > > "abort" interrupt comes way later while we're setting up the next
> > > transfer we'll really confuse ourselves.
> >
> > The interrupt handler just sets a completion variable. What does that
> > confuse?
> 
> The interrupt handler sees a "DONE" interrupt.  If we've made it far
> enough into setting up the next transfer that "cur_xfer" has been set
> then it might do more, no?

I thought it saw a cancel/abort EN bit?

        if (m_irq & M_CMD_CANCEL_EN)
                complete(&mas->cancel_done);
        if (m_irq & M_CMD_ABORT_EN)
                complete(&mas->abort_done)

and only a DONE bit if a transfer happened.

> 
> 
> > > I guess you could go the route of adding a synchronize_irq() at the
> > > start of the next transfer, but I'd rather add the overhead in the
> > > exceptional case (the timeout) than the normal case.  In the normal
> > > case we don't need to worry about random IRQs from the past transfer
> > > suddenly showing up.
> > >
> >
> > How does adding synchronize_irq() at the end guarantee that the abort is
> > cleared out of the hardware though? It seems to assume that the abort is
> > pending at the GIC when it could still be running through the hardware
> > and not executed yet. It seems like a synchronize_irq() for that is
> > wishful thinking that the irq is merely pending even though it timed
> > out and possibly never ran. Maybe it's stuck in a write buffer in the
> > CPU?
> 
> I guess I'm asserting that if a full second passed (because we timed
> out) and after that full second no interrupts are pending then the
> interrupt will never come.  That seems a reasonable assumption to me.
> It seems hard to believe it'd be stuck in a write buffer for a full
> second?
> 

Ok, so if we don't expect an irq to come in why are we calling
synchronize_irq()? I'm lost.