Date:   Tue, 5 Apr 2022 10:13:11 +0300
From:   Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
To:     Avi Fishman <avifishman70@...il.com>
Cc:     Tali Perry <tali.perry1@...il.com>,
        Tyrone Ting <warp5tw@...il.com>,
        Tomer Maimon <tmaimon77@...il.com>,
        Patrick Venture <venture@...gle.com>,
        Nancy Yuen <yuenn@...gle.com>,
        Benjamin Fair <benjaminfair@...gle.com>,
        Rob Herring <robh+dt@...nel.org>,
        Krzysztof Kozlowski <krzysztof.kozlowski@...onical.com>,
        yangyicong@...ilicon.com, semen.protsenko@...aro.org,
        Wolfram Sang <wsa@...nel.org>, jie.deng@...el.com,
        sven@...npeter.dev, bence98@....bme.hu,
        Lukas Bulwahn <lukas.bulwahn@...il.com>,
        Arnd Bergmann <arnd@...db.de>, olof@...om.net,
        Tali Perry <tali.perry@...oton.com>,
        Avi Fishman <Avi.Fishman@...oton.com>,
        Tomer Maimon <tomer.maimon@...oton.com>, KWLIU@...oton.com,
        JJLIU0@...oton.com, kfting@...oton.com,
        OpenBMC Maillist <openbmc@...ts.ozlabs.org>,
        Linux I2C <linux-i2c@...r.kernel.org>,
        devicetree <devicetree@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 09/11] i2c: npcm: Handle spurious interrupts

On Mon, Apr 04, 2022 at 08:03:44PM +0300, Avi Fishman wrote:
> On Thu, Mar 3, 2022 at 4:14 PM Andy Shevchenko
> <andriy.shevchenko@...ux.intel.com> wrote:
> > On Thu, Mar 03, 2022 at 02:48:20PM +0200, Tali Perry wrote:
> > > > On Thu, Mar 3, 2022 at 12:37 PM Andy Shevchenko <andriy.shevchenko@...ux.intel.com> wrote:
> > > > > On Thu, Mar 03, 2022 at 04:31:39PM +0800, Tyrone Ting wrote:
> > > > > > From: Tali Perry <tali.perry1@...il.com>
> > > > > >
> > > > > > In order to better handle spurious interrupts:
> > > > > > 1. Disable incoming interrupts in master only mode.
> > > > > > 2. Clear end of busy (EOB) after every interrupt.
> > > > > > 3. Return correct status during interrupt.
> > > > >
> > > > > This is bad commit message, it doesn't explain "why" you are doing these.
> >
> > ...
> >
> > > BMC users connect a huge tree of i2c devices and muxes.
> > > This tree suffers from spikes, noise and double clocks.
> > > All these may cause spurious interrupts to the BMC.

(1)

> > > If the driver gets an IRQ which was not expected and was not handled
> > > by the IRQ handler,
> > > there is nothing left to do but to clear the interrupt and move on.
> >
> > Yes, the problem is what "move on" means in your case.
> > If you get spurious interrupts, there are several possibilities for what's wrong:
> > 1) HW bug(s)
> > 2) FW bug(s)
> > 3) Missed IRQ mask in the driver
> > 4) Improper IRQ mask in the driver
> >
> > The below approach seems incorrect to me.
> 
> Andy, What about this explanation:
> In rare cases the I2C controller gets a spurious interrupt, meaning that
> we enter the interrupt handler but don't find any status bit that points
> to the reason we got this interrupt.
> This may be a rare HW issue that is still under investigation.
> In order to overcome this we are doing the following:
> 1. Disable incoming interrupts in master mode only when slave mode is
> not enabled.
> 2. Clear end of busy (EOB) after every interrupt.
> 3. Clear other status bits (just in case since we found them cleared)
> 4. Return correct status during the interrupt that will finish the transaction.
> On the next xmit transaction, if the bus is still busy, the master will
> run a recovery process before issuing the new transaction.

This sounds better, thanks.
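
To make the four steps above concrete, here is a minimal user-space sketch of
that flow. It is not the npcm driver code: the struct, the bit names
(ST_EOB, ST_NACK, ST_BER, CTL_SLAVE_INTEN) and both functions are hypothetical,
and registers are modeled as plain variables (real hardware often uses
write-1-to-clear semantics). Finishing the transfer with an error on a spurious
interrupt is my reading of step 4, so treat it as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bits for illustration only; not the NPCM register layout. */
#define ST_EOB          0x01u   /* end of busy */
#define ST_NACK         0x02u   /* negative acknowledge */
#define ST_BER          0x04u   /* bus error */
#define CTL_SLAVE_INTEN 0x01u   /* enable for "incoming" interrupts */
#define MODEL_EIO       5       /* stand-in for an I/O error code */

enum model_irq { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED };

struct i2c_model {
	uint8_t status;
	uint8_t ctl;
	bool slave_enabled;
	bool xfer_active;
	int xfer_result;         /* 0 on success, negative on error */
	bool bus_busy;
	unsigned int spurious_count;
	unsigned int recoveries;
};

/* Start a master transfer: recover a stuck bus first, then apply step 1. */
static void model_start_xfer(struct i2c_model *dev)
{
	if (dev->bus_busy) {
		dev->bus_busy = false;  /* stands in for a real recovery sequence */
		dev->recoveries++;
	}
	/* Step 1: master-only mode, so mask incoming interrupts. */
	if (!dev->slave_enabled)
		dev->ctl &= (uint8_t)~CTL_SLAVE_INTEN;
	dev->xfer_active = true;
	dev->xfer_result = 0;
}

static enum model_irq model_isr(struct i2c_model *dev)
{
	uint8_t st = dev->status;

	/* Steps 2 and 3: clear EOB, then every other status bit, each time. */
	dev->status &= (uint8_t)~ST_EOB;
	dev->status &= (uint8_t)~(ST_NACK | ST_BER);

	if (st & (ST_NACK | ST_BER)) {
		dev->xfer_result = -MODEL_EIO;
		dev->xfer_active = false;
	} else if (st & ST_EOB) {
		dev->xfer_result = 0;
		dev->xfer_active = false;
	} else {
		/* No status bit explains the interrupt: count it, do not hide it. */
		dev->spurious_count++;
		if (dev->xfer_active) {
			/* Step 4: finish the transfer with an error status. */
			dev->xfer_result = -MODEL_EIO;
			dev->xfer_active = false;
			dev->bus_busy = true;  /* next transfer will recover */
		}
	}
	return MODEL_IRQ_HANDLED;
}
```

Calling model_start_xfer(), then model_isr() with no status bit set, ends the
transfer with -MODEL_EIO and bumps spurious_count; the following
model_start_xfer() runs the recovery branch before starting the new transfer.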

One thing to clarify: (1) states that the HW "issue" is known and is a
PCB-level one, i.e. a noisy environment that has not been properly shielded.
So, if it is known, please put the reason in the commit message.

It would also be good to see numbers for "rare". Is it 0.1%?
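
One possible way to get such a number, assuming the driver (or a debug build of
it) keeps two counters, one for all interrupts and one for those with no status
bit set. The helper name and the counters are hypothetical; the result is in
parts per million, so 0.1% corresponds to 1000.

```c
#include <stdint.h>

/*
 * Spurious interrupts per million interrupts taken.
 * Returns 0 when no interrupts were counted at all.
 * Note: spurious * 1000000 can overflow for extremely large counts.
 */
static uint64_t spurious_ppm(uint64_t spurious, uint64_t total)
{
	if (total == 0)
		return 0;
	return spurious * 1000000u / total;
}
```

For example, 1 spurious interrupt out of 1000 gives 1000 ppm, i.e. 0.1%.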

> > > If the transaction failed, driver has a recovery function.
> > > After that, user may retry to send the message.
> > >
> > > Indeed the commit message doesn't explain all this.
> > > We will fix and add to the next patchset.
> > >
> > > > > > +     /*
> > > > > > +      * if irq is not one of the above, make sure EOB is disabled and all
> > > > > > +      * status bits are cleared.
> > > > >
> > > > > This does not explain why you hide the spurious interrupt.
> > > > >
> > > > > > +      */

-- 
With Best Regards,
Andy Shevchenko

