[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+J-oUvm-3G9GRCzjOd+j8K6iNs1piCFAKBNfwih49iFwiB4pA@mail.gmail.com>
Date: Tue, 24 Jun 2025 18:32:14 +0800
From: Zhang Jian <zhangjian.3032@...edance.com>
To: Paul Menzel <pmenzel@...gen.mpg.de>
Cc: Ryan Chen <ryan_chen@...eedtech.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>, Joel Stanley <joel@....id.au>,
Andi Shyti <andi.shyti@...nel.org>, Andrew Jeffery <andrew@...econstruct.com.au>,
linux-i2c@...r.kernel.org, openbmc@...ts.ozlabs.org,
linux-arm-kernel@...ts.infradead.org, linux-aspeed@...ts.ozlabs.org,
linux-kernel@...r.kernel.org
Subject: Re: [External] Re: [PATCH] i2c: aspeed: change debug level in irq handler
Hi Paul;
Thanks for your reply and sorry for the late reply, I was trying to
figure out why this log occurred,
it's quite hard to reproduce.
I traced all the master and slave states, and eventually found that
the behavior matches
the description in commit b4cc1cbba519. The issue has already been
fixed in that commit
it was caused by a state machine bug that led to the interrupt not
being handled correctly.
see: https://github.com/torvalds/linux/commit/b4cc1cbba519
(The state transitions between the master and slave here rely on interrupts.
When the signal waveform is incomplete (such as during power off/on),
it may cause state errors or brief unresponsiveness, resulting in some
log prints.)
On Thu, Jun 19, 2025 at 7:18 AM Paul Menzel <pmenzel@...gen.mpg.de> wrote:
>
> Dear Jian,
>
>
> Thank you for the patch.
>
> Am 18.06.25 um 12:21 schrieb Jian Zhang:
> > In interrupt context, using dev_err() can potentially cause latency
> > or affect system responsiveness due to printing to console.
> >
> > In our scenario, under certain conditions, i2c1 repeatedly printed
> > "irq handled != irq. expected ..." around 20 times within 1 second.
>
> Any idea, why you hit this error at all?
>
> > Each dev_err() log introduced approximately 10ms of blocking time,
> > which delayed the handling of other interrupts — for example, i2c2.
> >
> > At the time, i2c2 was performing a PMBus firmware upgrade. The
> > target device on i2c2 was time-sensitive, and the upgrade protocol
> > was non-retryable. As a result, the delay caused by frequent error
> > logging led to a timeout and ultimately a failed firmware upgrade.
> >
> > Frequent error printing in interrupt context can be dangerous,
> > as it introduces latency and interferes with time-critical tasks.
> > This patch changes the log level from dev_err() to dev_dbg() to
> > reduce potential impact.
>
> Thank you for the patch and the problem description. Hiding an error
> condition behind debug level is also not good, as administrators might
> miss hardware issues. I do not have a solution. Is there something
> similar to WARN_ONCE? Maybe the level should be a warning instead of
> error, because the system is often able to cope with this?
Yeah, I'm a bit unsure as well. Maybe I can use dev_err_ratelimited()?
>
> The code is from 2017, so should be well tested actually, shouldn’t it?
>
> > Signed-off-by: Jian Zhang <zhangjian.3032@...edance.com>
> > ---
> > drivers/i2c/busses/i2c-aspeed.c | 18 +++++++++---------
> > 1 file changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
> > index 1550d3d552ae..38e23c826f39 100644
> > --- a/drivers/i2c/busses/i2c-aspeed.c
> > +++ b/drivers/i2c/busses/i2c-aspeed.c
> > @@ -317,7 +317,7 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> > switch (bus->slave_state) {
> > case ASPEED_I2C_SLAVE_READ_REQUESTED:
> > if (unlikely(irq_status & ASPEED_I2CD_INTR_TX_ACK))
> > - dev_err(bus->dev, "Unexpected ACK on read request.\n");
> > + dev_dbg(bus->dev, "Unexpected ACK on read request.\n");
> > bus->slave_state = ASPEED_I2C_SLAVE_READ_PROCESSED;
> > i2c_slave_event(slave, I2C_SLAVE_READ_REQUESTED, &value);
> > writel(value, bus->base + ASPEED_I2C_BYTE_BUF_REG);
> > @@ -325,7 +325,7 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> > break;
> > case ASPEED_I2C_SLAVE_READ_PROCESSED:
> > if (unlikely(!(irq_status & ASPEED_I2CD_INTR_TX_ACK))) {
> > - dev_err(bus->dev,
> > + dev_dbg(bus->dev,
> > "Expected ACK after processed read.\n");
> > break;
> > }
> > @@ -354,7 +354,7 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> > /* Slave was just started. Waiting for the next event. */;
> > break;
> > default:
> > - dev_err(bus->dev, "unknown slave_state: %d\n",
> > + dev_dbg(bus->dev, "unknown slave_state: %d\n",
> > bus->slave_state);
> > bus->slave_state = ASPEED_I2C_SLAVE_INACTIVE;
> > break;
> > @@ -459,7 +459,7 @@ static u32 aspeed_i2c_master_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> >
> > /* We are in an invalid state; reset bus to a known state. */
> > if (!bus->msgs) {
> > - dev_err(bus->dev, "bus in unknown state. irq_status: 0x%x\n",
> > + dev_dbg(bus->dev, "bus in unknown state. irq_status: 0x%x\n",
> > irq_status);
> > bus->cmd_err = -EIO;
> > if (bus->master_state != ASPEED_I2C_MASTER_STOP &&
> > @@ -523,7 +523,7 @@ static u32 aspeed_i2c_master_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> > irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
> > goto error_and_stop;
> > } else if (unlikely(!(irq_status & ASPEED_I2CD_INTR_TX_ACK))) {
> > - dev_err(bus->dev, "slave failed to ACK TX\n");
> > + dev_dbg(bus->dev, "slave failed to ACK TX\n");
> > goto error_and_stop;
> > }
> > irq_handled |= ASPEED_I2CD_INTR_TX_ACK;
> > @@ -546,7 +546,7 @@ static u32 aspeed_i2c_master_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> > fallthrough;
> > case ASPEED_I2C_MASTER_RX:
> > if (unlikely(!(irq_status & ASPEED_I2CD_INTR_RX_DONE))) {
> > - dev_err(bus->dev, "master failed to RX\n");
> > + dev_dbg(bus->dev, "master failed to RX\n");
> > goto error_and_stop;
> > }
> > irq_handled |= ASPEED_I2CD_INTR_RX_DONE;
> > @@ -577,7 +577,7 @@ static u32 aspeed_i2c_master_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> > goto out_no_complete;
> > case ASPEED_I2C_MASTER_STOP:
> > if (unlikely(!(irq_status & ASPEED_I2CD_INTR_NORMAL_STOP))) {
> > - dev_err(bus->dev,
> > + dev_dbg(bus->dev,
> > "master failed to STOP. irq_status:0x%x\n",
> > irq_status);
> > bus->cmd_err = -EIO;
> > @@ -589,7 +589,7 @@ static u32 aspeed_i2c_master_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> > bus->master_state = ASPEED_I2C_MASTER_INACTIVE;
> > goto out_complete;
> > case ASPEED_I2C_MASTER_INACTIVE:
> > - dev_err(bus->dev,
> > + dev_dbg(bus->dev,
> > "master received interrupt 0x%08x, but is inactive\n",
> > irq_status);
> > bus->cmd_err = -EIO;
> > @@ -665,7 +665,7 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
> >
> > irq_remaining &= ~irq_handled;
> > if (irq_remaining)
> > - dev_err(bus->dev,
> > + dev_dbg(bus->dev,
> > "irq handled != irq. expected 0x%08x, but was 0x%08x\n",
> > irq_received, irq_handled);
> >
>
>
> Kind regards,
>
> Paul
Jian.
Powered by blists - more mailing lists