lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 6 Jun 2023 19:40:15 +0000
From:   Robert Hancock <robert.hancock@...ian.com>
To:     "andi.shyti@...nel.org" <andi.shyti@...nel.org>
CC:     "michal.simek@....com" <michal.simek@....com>,
        "shubhraj@...inx.com" <shubhraj@...inx.com>,
        "marex@...x.de" <marex@...x.de>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-i2c@...r.kernel.org" <linux-i2c@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "wsa@...nel.org" <wsa@...nel.org>
Subject: Re: [PATCH] i2c: xiic: Don't try to handle more interrupt events
 after error

On Tue, 2023-06-06 at 21:24 +0200, Andi Shyti wrote:
> Hi Robert,
> 
> On Tue, Jun 06, 2023 at 12:25:58PM -0600, Robert Hancock wrote:
> > In xiic_process, it is possible that error events such as
> > arbitration
> > lost or TX error can be raised in conjunction with other interrupt
> > flags
> > such as TX FIFO empty or bus not busy. Error events result in the
> > controller being reset and the error returned to the calling
> > request,
> > but the function could potentially try to keep handling the other
> > events, such as by writing more messages into the TX FIFO. Since
> > the
> > transaction has already failed, this is not helpful and will just
> > cause
> > issues.
> 
> what kind of issues?
> 

The one I ran into is what I alluded to further down, where that
WARN_ON was triggering repeatedly, which ended up flooding the kernel
log and causing the device's watchdog timer to timeout. I'm not sure
what other forms of havoc might ensue, other than "nothing good"..

> > This problem has been present ever since:
> > 
> > commit 7f9906bd7f72 ("i2c: xiic: Service all interrupts in isr")
> > 
> > which allowed non-error events to be handled after errors, but
> > became
> > more obvious after:
> > 
> > commit 743e227a8959 ("i2c: xiic: Defer xiic_wakeup() and
> > __xiic_start_xfer() in xiic_process()")
> > 
> > which reworked the code to add a WARN_ON which triggers if both the
> > xfer_more and wakeup_req flags were set, since this combination is
> > not supposed to happen, but was occurring in this scenario.
> > 
> > Skip further interrupt handling after error flags are detected to
> > avoid
> > this problem.
> > 
> > Fixes: 7f9906bd7f72 ("i2c: xiic: Service all interrupts in isr")
> > Signed-off-by: Robert Hancock <robert.hancock@...ian.com>
> 
> please add:
> 
> Cc: <stable@...r.kernel.org> # v4.3+
> 

Can add for a v2 - although with the Fixes tag it would most likely be
picked up for stable anyway..

> > ---
> >  drivers/i2c/busses/i2c-xiic.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/i2c/busses/i2c-xiic.c
> > b/drivers/i2c/busses/i2c-xiic.c
> > index 8a3d9817cb41..ee6edc963dea 100644
> > --- a/drivers/i2c/busses/i2c-xiic.c
> > +++ b/drivers/i2c/busses/i2c-xiic.c
> > @@ -721,6 +721,8 @@ static irqreturn_t xiic_process(int irq, void
> > *dev_id)
> >                       wakeup_req = 1;
> >                       wakeup_code = STATE_ERROR;
> >               }
> > +             /* don't try to handle other events */
> > +             goto out;
> 
> why don't we have goto's after every irq evaluation but only
> here? Do the issues you mentioned happen olny in this particular
> error case?
> 

As far as I can tell, yes. For example you could legitimately have
XIIC_INTR_TX_EMPTY_MASK and XIIC_INTR_BNB_MASK both being handled. The
error case is special as we end up resetting the controller, so it
doesn't make sense to try to continue with the rest of the transaction
while also completing it with an error.

> Thanks,
> Andi
> 
> >       }
> >       if (pend & XIIC_INTR_RX_FULL_MASK) {
> >               /* Receive register/FIFO is full */
> > --
> > 2.40.1
> > 

-- 
Robert Hancock <robert.hancock@...ian.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ