[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZwTK5Jip2YJrSd8L@f642ec5a18a7>
Date: Tue, 8 Oct 2024 09:02:12 +0300
From: Ian Ray <ian.ray@...ealthcare.com>
To: Jean Delvare <jdelvare@...e.de>
Cc: Linus Walleij <linus.walleij@...aro.org>,
Bartosz Golaszewski <brgl@...ev.pl>, linux-gpio@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] gpio: pca953x: fix pca953x_irq_bus_sync_unlock race
On Mon, Oct 07, 2024 at 11:16:51PM +0200, Jean Delvare wrote:
> Hi Ray,
>
> On Thu, 2024-06-20 at 07:29 +0300, Ian Ray wrote:
> > Ensure that `i2c_lock' is held when setting interrupt latch and mask in
> > pca953x_irq_bus_sync_unlock() in order to avoid races.
> >
> > The other (non-probe) call site pca953x_gpio_set_multiple() ensures the
> > lock is held before calling pca953x_write_regs().
> >
> > The problem occurred when a request raced against irq_bus_sync_unlock()
> > approximately once per thousand reboots on an i.MX8MP based system.
> >
> > * Normal case
> >
> > 0-0022: write register AI|3a {03,02,00,00,01} Input latch P0
> > 0-0022: write register AI|49 {fc,fd,ff,ff,fe} Interrupt mask P0
> > 0-0022: write register AI|08 {ff,00,00,00,00} Output P3
> > 0-0022: write register AI|12 {fc,00,00,00,00} Config P3
> >
> > * Race case
> >
> > 0-0022: write register AI|08 {ff,00,00,00,00} Output P3
> > 0-0022: write register AI|08 {03,02,00,00,01} *** Wrong register ***
> > 0-0022: write register AI|12 {fc,00,00,00,00} Config P3
> > 0-0022: write register AI|49 {fc,fd,ff,ff,fe} Interrupt mask P0
> >
>
> I have more questions on this. Where does the above log come from?
> Specifically, at which layer (bus driver, regmap, gpio device drier)?
Additional debug, with manually added commentary (sorry for not being
clearer). The debug was added to drivers/base/regmap/regmap-i2c.c while
investigating the issue.
> What do these values represent exactly? Which GPIO chip was used on
> your system? Which i2c bus driver is being used on that system? What
> are the "requests" you mention in the description above?
GPIO expander pi4ioe5v6534q at I2C address 0-0022.
# grep . {name,uevent}
name:30a20000.i2c
uevent:OF_NAME=i2c
uevent:OF_FULLNAME=/soc@...us@...00000/i2c@...20000
uevent:OF_COMPATIBLE_0=fsl,imx8mp-i2c
uevent:OF_COMPATIBLE_1=fsl,imx21-i2c
uevent:OF_COMPATIBLE_N=2
uevent:OF_ALIAS_0=i2c0
>
> I'm asking because I do not understand how writing to the wrong
> register can happen, even without holding i2c_lock in
> pca953x_irq_bus_sync_unlock(). The i2c layer has a per-i2c_adapter lock
Given that pca953x_irq_bus_sync_unlock is part of an interrupt handler,
IMHO this explains very well why locking is needed (but I did not dig
deeper than that).
> which is taken before any bus transfer, so it isn't possible that two
> transfers collide at the bus level. So the lack of locking at the
> device driver level could lead to data corruption (for example read-
> modify-write cycles overlapping), but not to data being written to the
> wrong register.
Based on the observed data, the hypothesis was that pca953x_write_regs
(called via pca953x_gpio_set_multiple) and pca953x_irq_bus_sync_unlock
can race.
The missing guard neatly explained and fixed the issue (disclaimer: on
my hardware for my scenario).
>
> As a side note, I dug through the history of the gpio-pca953x driver
> and found that i2c_lock was introduced before the driver was converted
> to regmap by:
>
> commit 6e20fb18054c179d7e64c0af43d855b9310a3394
> Author: Roland Stigge
> Date: Thu Feb 10 15:01:23 2011 -0800
>
> drivers/gpio/pca953x.c: add a mutex to fix race condition
>
> The fix added locking around read-modify-write cycles (which was indeed
> needed) and also around simple register reads (which I don't think was
> needed).
>
> It turns out that regmap has its own protection around read-modify-
> write cycles (see regmap_update_bits_base) so I think several uses of
> i2c_lock should have been removed from the gpio-pca953x driver when it
> was converted to regmap as they became redundant then. This driver-side
> lock is still needed in a number of functions though, where the read-
> modify-write is handled outside of regmap (for example in
> pca953x_gpio_set_multiple).
>
Blue skies,
Ian
> Thanks,
> --
> Jean Delvare
> SUSE L3 Support
Powered by blists - more mailing lists