Message-ID: <CAPY8ntD1F=Lskoayc9YHKvvh59TcZe7zdfq5H=M5mbNHraGUDQ@mail.gmail.com>
Date: Tue, 31 Oct 2023 15:34:14 +0000
From: Dave Stevenson <dave.stevenson@...pberrypi.com>
To: Stefan Wahren <wahrenst@....net>
Cc: mike.isely@...altdigital.com, Andi Shyti <andi.shyti@...nel.org>,
Florian Fainelli <florian.fainelli@...adcom.com>,
Phil Elwell <phil@...pberrypi.com>,
Mike Isely <isely@...ox.com>,
Broadcom internal kernel review list
<bcm-kernel-feedback-list@...adcom.com>,
Ray Jui <rjui@...adcom.com>,
Scott Branden <sbranden@...adcom.com>,
linux-rpi-kernel@...ts.infradead.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] [i2c-bcm2835] ALWAYS enable INTD

(Thanks Stefan for forwarding - linux-rpi-kernel seems to be a little
too aggressive on spam filtering, and I'm not on the other lists
cc'ed).

Hi Mike

On Tue, 31 Oct 2023 at 12:37, Stefan Wahren <wahrenst@....net> wrote:
>
> [Forward to Dave and Phil]
>
> On 30.10.23 at 17:21, mike.isely@...altdigital.com wrote:
> > From: Mike Isely <mike.isely@...altdigital.com>
> >
> > There is a race in the bcm2835 i2c hardware: When one starts a write
> > transaction, two things apparently take place at the same time: (1) an
> > interrupt is posted to cause the FIFO to be filled with TX data,
> > and (2) an I2C transaction is started on the wire with the slave
> > select byte. The race happens if there's no slave, as this causes a
> > slave selection timeout, raising the ERR flag in the hardware and
> > setting DONE. The setting of that DONE flag races against TXW. If
> > TXW gets set first, then an interrupt is raised if INTT was set. If
> > ERR gets set first, then an interrupt is raised if INTD was set. It's
> > one or the other, not both - probably because DONE being set disables
> > the hardware INTT interrupt path.

I'm not following the full sequence of events required here.
If you only had a slave selection message, then num_msgs = 1 and INTD
will be enabled immediately anyway.
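
To make that concrete, here is a minimal sketch of how the pre-patch code picks the control word. The helper name control_word() is hypothetical, and the bit positions are my reading of the BCM2835 datasheet, so treat them as assumptions and check them against the driver's own defines.

```c
#include <stdbool.h>
#include <stdint.h>

/* Control register bits (assumed from the BCM2835 datasheet). */
#define C_READ  (1u << 0)
#define C_ST    (1u << 7)
#define C_INTD  (1u << 8)
#define C_INTT  (1u << 9)
#define C_INTR  (1u << 10)
#define C_I2CEN (1u << 15)

/*
 * Hypothetical helper mirroring the pre-patch logic in
 * bcm2835_i2c_start_transfer(). msgs_left is num_msgs as seen
 * before it is decremented.
 */
static uint32_t control_word(unsigned int msgs_left, bool is_read)
{
	uint32_t c = C_ST | C_I2CEN;

	if (is_read)
		c |= C_READ | C_INTR;
	else
		c |= C_INTT;

	if (msgs_left == 1)	/* last_msg in the driver */
		c |= C_INTD;

	return c;
}
```

With a single message, control_word(1, false) already has C_INTD set; only the earlier messages of a multi-message transfer go out without it.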

I did investigate some I2C problems back in May, prompted by issues
observed between one of the camera modules and the DSI screen touch
controller. If memory serves correctly, the biggest issue I found was
that aborting an active transaction just left SDA & SCL in whatever
state they were in at the time, including midway through a byte and
with no stop condition. I didn't find a valid way to do a controlled
stop, and therefore ended up with a patch that will always complete
the transaction before looking at the status flags [1]. (Yes, I really
should upstream those patches).

In a linked thread [2] I think I found that the ERR flag wasn't
signalled until the end of the complete transaction.

Hang on, if you're always enabling BCM2835_I2C_C_INTD, then if we have
a write of N bytes and read of M bytes, don't we get a DONE after the
write, meaning that the ISR completes then due to the clause at [3]
and we never do the read? Something feels wrong here.
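
A toy model of that worry, not the driver code: the struct and function names are made up for illustration, and it simply assumes that DONE really is raised after the first message and that the DONE path completes the transfer without looking at num_msgs. Whether the hardware behaves that way on a repeated start is exactly the open question.

```c
#include <stdbool.h>

/* Toy state, loosely modelled on the fields the patch touches. */
struct xfer_model {
	unsigned int num_msgs;	/* messages not yet started */
	bool completed;		/* ISR has signalled completion */
};

/* Start the next message, decrementing num_msgs as the driver does. */
static void model_start(struct xfer_model *x)
{
	if (x->num_msgs)
		x->num_msgs--;
}

/*
 * Assumed DONE handling: complete the transfer as soon as DONE is
 * seen, without checking whether more messages are queued.
 */
static void model_on_done(struct xfer_model *x)
{
	x->completed = true;
}

/*
 * Write followed by read, with DONE interrupting after the write.
 * Returns how many messages were never started.
 */
static unsigned int model_write_then_read(void)
{
	struct xfer_model x = { .num_msgs = 2, .completed = false };

	model_start(&x);	/* write of N bytes */
	model_on_done(&x);	/* DONE fires because INTD is always on */

	return x.completed ? x.num_msgs : 0;
}
```

Under those assumptions model_write_then_read() reports one message left over, which is the read.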

Dave

[1] https://github.com/raspberrypi/linux/pull/5479/commits
[2] https://forums.raspberrypi.com/viewtopic.php?p=2098691#p2098691
[3] https://github.com/torvalds/linux/blob/master/drivers/i2c/busses/i2c-bcm2835.c#L293-L306

> >
> > MOST of the time, TXW gets set first, the ISR runs, sees ERR is set
> > and cleanly fails the transaction. However some of the time DONE gets
> > set first - but since the driver doesn't enable INTD until it's on the
> > last message - there's no interrupt at all. Thus the ISR never fires
> > and the driver detects a timeout instead. At best, the "wrong" error
> > code is delivered to the owner of the transaction. At worst, if the
> > timeout doesn't properly clean up the hardware (see prior commit
> > fixing that), the next - likely unrelated - transaction will get
> > fouled, leading to bizarre behavior in logic otherwise unrelated to
> > the source of the original error.
> >
> > The fix here is to set INTD for all messages, not just the last one.
> > In that way, unexpected failures which might set DONE earlier than
> > expected will always trigger an interrupt and be handled correctly.
> >
> > The datasheet for this hardware doesn't describe any scenario where
> > the hardware can realistically hang - even a stretched clock will be
> > noticed if it takes too long. So in theory a timeout should really
> > NEVER happen, and with this fix I was completely unable to trigger any
> > further timeouts at all.
> >
> > Signed-off-by: Mike Isely <isely@...ox.com>
> > ---
> > drivers/i2c/busses/i2c-bcm2835.c | 6 +-----
> > 1 file changed, 1 insertion(+), 5 deletions(-)
> >
> > diff --git a/drivers/i2c/busses/i2c-bcm2835.c b/drivers/i2c/busses/i2c-bcm2835.c
> > index 96de875394e1..70005c037ff9 100644
> > --- a/drivers/i2c/busses/i2c-bcm2835.c
> > +++ b/drivers/i2c/busses/i2c-bcm2835.c
> > @@ -235,26 +235,22 @@ static void bcm2835_drain_rxfifo(struct bcm2835_i2c_dev *i2c_dev)
> >
> >   static void bcm2835_i2c_start_transfer(struct bcm2835_i2c_dev *i2c_dev)
> >   {
> > -	u32 c = BCM2835_I2C_C_ST | BCM2835_I2C_C_I2CEN;
> > +	u32 c = BCM2835_I2C_C_ST | BCM2835_I2C_C_I2CEN | BCM2835_I2C_C_INTD;
> >   	struct i2c_msg *msg = i2c_dev->curr_msg;
> > -	bool last_msg = (i2c_dev->num_msgs == 1);
> >
> >   	if (!i2c_dev->num_msgs)
> >   		return;
> >
> >   	i2c_dev->num_msgs--;
> >   	i2c_dev->msg_buf = msg->buf;
> >   	i2c_dev->msg_buf_remaining = msg->len;
> >
> >   	if (msg->flags & I2C_M_RD)
> >   		c |= BCM2835_I2C_C_READ | BCM2835_I2C_C_INTR;
> >   	else
> >   		c |= BCM2835_I2C_C_INTT;
> >
> > -	if (last_msg)
> > -		c |= BCM2835_I2C_C_INTD;
> > -
> >   	bcm2835_i2c_writel(i2c_dev, BCM2835_I2C_A, msg->addr);
> >   	bcm2835_i2c_writel(i2c_dev, BCM2835_I2C_DLEN, msg->len);
> >   	bcm2835_i2c_writel(i2c_dev, BCM2835_I2C_C, c);
> >   }
>