lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+J-oUvm-3G9GRCzjOd+j8K6iNs1piCFAKBNfwih49iFwiB4pA@mail.gmail.com>
Date: Tue, 24 Jun 2025 18:32:14 +0800
From: Zhang Jian <zhangjian.3032@...edance.com>
To: Paul Menzel <pmenzel@...gen.mpg.de>
Cc: Ryan Chen <ryan_chen@...eedtech.com>, 
	Benjamin Herrenschmidt <benh@...nel.crashing.org>, Joel Stanley <joel@....id.au>, 
	Andi Shyti <andi.shyti@...nel.org>, Andrew Jeffery <andrew@...econstruct.com.au>, 
	linux-i2c@...r.kernel.org, openbmc@...ts.ozlabs.org, 
	linux-arm-kernel@...ts.infradead.org, linux-aspeed@...ts.ozlabs.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [External] Re: [PATCH] i2c: aspeed: change debug level in irq handler

Hi Paul;

Thanks for your reply and sorry for the late reply, I was trying to
figure out why this log occurred,
 it's quite hard to reproduce.

I traced all the master and slave states, and eventually found that
the behavior matches
the description in commit b4cc1cbba519. The issue has already been
fixed in that commit
 it was caused by a state machine bug that led to the interrupt not
being handled correctly.

see: https://github.com/torvalds/linux/commit/b4cc1cbba519

(The state transitions between the master and slave here rely on interrupts.
 When the signal waveform is incomplete (such as during power off/on),
it may cause state errors or brief unresponsiveness, resulting in some
log prints.)

On Thu, Jun 19, 2025 at 7:18 AM Paul Menzel <pmenzel@...gen.mpg.de> wrote:
>
> Dear Jian,
>
>
> Thank you for the patch.
>
> Am 18.06.25 um 12:21 schrieb Jian Zhang:
> > In interrupt context, using dev_err() can potentially cause latency
> > or affect system responsiveness due to printing to console.
> >
> > In our scenario, under certain conditions, i2c1 repeatedly printed
> > "irq handled != irq. expected ..." around 20 times within 1 second.
>
> Any idea, why you hit this error at all?
>
> > Each dev_err() log introduced approximately 10ms of blocking time,
> > which delayed the handling of other interrupts — for example, i2c2.
> >
> > At the time, i2c2 was performing a PMBus firmware upgrade. The
> > target device on i2c2 was time-sensitive, and the upgrade protocol
> > was non-retryable. As a result, the delay caused by frequent error
> > logging led to a timeout and ultimately a failed firmware upgrade.
> >
> > Frequent error printing in interrupt context can be dangerous,
> > as it introduces latency and interferes with time-critical tasks.
> > This patch changes the log level from dev_err() to dev_dbg() to
> > reduce potential impact.
>
> Thank you for the patch and the problem description. Hiding an error
> condition behind debug level is also not good, as administrators might
> miss hardware issues. I do not have a solution. Is there something
> similar to WARN_ONCE? Maybe the level should be a warning instead of
> error, because the system is often able to cope with this?
Yeah, I'm a bit unsure as well. Maybe I can use dev_err_ratelimited()?

>
> The code is from 2017, so should be well tested actually, shouldn’t it?
>
> > Signed-off-by: Jian Zhang <zhangjian.3032@...edance.com>
> > ---
> >   drivers/i2c/busses/i2c-aspeed.c | 18 +++++++++---------
> >   1 file changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
> > index 1550d3d552ae..38e23c826f39 100644
> > --- a/drivers/i2c/busses/i2c-aspeed.c
> > +++ b/drivers/i2c/busses/i2c-aspeed.c
> > @@ -317,7 +317,7 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> >       switch (bus->slave_state) {
> >       case ASPEED_I2C_SLAVE_READ_REQUESTED:
> >               if (unlikely(irq_status & ASPEED_I2CD_INTR_TX_ACK))
> > -                     dev_err(bus->dev, "Unexpected ACK on read request.\n");
> > +                     dev_dbg(bus->dev, "Unexpected ACK on read request.\n");
> >               bus->slave_state = ASPEED_I2C_SLAVE_READ_PROCESSED;
> >               i2c_slave_event(slave, I2C_SLAVE_READ_REQUESTED, &value);
> >               writel(value, bus->base + ASPEED_I2C_BYTE_BUF_REG);
> > @@ -325,7 +325,7 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> >               break;
> >       case ASPEED_I2C_SLAVE_READ_PROCESSED:
> >               if (unlikely(!(irq_status & ASPEED_I2CD_INTR_TX_ACK))) {
> > -                     dev_err(bus->dev,
> > +                     dev_dbg(bus->dev,
> >                               "Expected ACK after processed read.\n");
> >                       break;
> >               }
> > @@ -354,7 +354,7 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> >               /* Slave was just started. Waiting for the next event. */;
> >               break;
> >       default:
> > -             dev_err(bus->dev, "unknown slave_state: %d\n",
> > +             dev_dbg(bus->dev, "unknown slave_state: %d\n",
> >                       bus->slave_state);
> >               bus->slave_state = ASPEED_I2C_SLAVE_INACTIVE;
> >               break;
> > @@ -459,7 +459,7 @@ static u32 aspeed_i2c_master_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> >
> >       /* We are in an invalid state; reset bus to a known state. */
> >       if (!bus->msgs) {
> > -             dev_err(bus->dev, "bus in unknown state. irq_status: 0x%x\n",
> > +             dev_dbg(bus->dev, "bus in unknown state. irq_status: 0x%x\n",
> >                       irq_status);
> >               bus->cmd_err = -EIO;
> >               if (bus->master_state != ASPEED_I2C_MASTER_STOP &&
> > @@ -523,7 +523,7 @@ static u32 aspeed_i2c_master_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> >                       irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
> >                       goto error_and_stop;
> >               } else if (unlikely(!(irq_status & ASPEED_I2CD_INTR_TX_ACK))) {
> > -                     dev_err(bus->dev, "slave failed to ACK TX\n");
> > +                     dev_dbg(bus->dev, "slave failed to ACK TX\n");
> >                       goto error_and_stop;
> >               }
> >               irq_handled |= ASPEED_I2CD_INTR_TX_ACK;
> > @@ -546,7 +546,7 @@ static u32 aspeed_i2c_master_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> >               fallthrough;
> >       case ASPEED_I2C_MASTER_RX:
> >               if (unlikely(!(irq_status & ASPEED_I2CD_INTR_RX_DONE))) {
> > -                     dev_err(bus->dev, "master failed to RX\n");
> > +                     dev_dbg(bus->dev, "master failed to RX\n");
> >                       goto error_and_stop;
> >               }
> >               irq_handled |= ASPEED_I2CD_INTR_RX_DONE;
> > @@ -577,7 +577,7 @@ static u32 aspeed_i2c_master_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> >               goto out_no_complete;
> >       case ASPEED_I2C_MASTER_STOP:
> >               if (unlikely(!(irq_status & ASPEED_I2CD_INTR_NORMAL_STOP))) {
> > -                     dev_err(bus->dev,
> > +                     dev_dbg(bus->dev,
> >                               "master failed to STOP. irq_status:0x%x\n",
> >                               irq_status);
> >                       bus->cmd_err = -EIO;
> > @@ -589,7 +589,7 @@ static u32 aspeed_i2c_master_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> >               bus->master_state = ASPEED_I2C_MASTER_INACTIVE;
> >               goto out_complete;
> >       case ASPEED_I2C_MASTER_INACTIVE:
> > -             dev_err(bus->dev,
> > +             dev_dbg(bus->dev,
> >                       "master received interrupt 0x%08x, but is inactive\n",
> >                       irq_status);
> >               bus->cmd_err = -EIO;
> > @@ -665,7 +665,7 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
> >
> >       irq_remaining &= ~irq_handled;
> >       if (irq_remaining)
> > -             dev_err(bus->dev,
> > +             dev_dbg(bus->dev,
> >                       "irq handled != irq. expected 0x%08x, but was 0x%08x\n",
> >                       irq_received, irq_handled);
> >
>
>
> Kind regards,
>
> Paul

Jian.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ