Message-ID: <20240924190201.xvxf5ugpmrveyo5r@pali>
Date: Tue, 24 Sep 2024 21:02:01 +0200
From: Pali Rohár <pali@...nel.org>
To: Jerry Lv <Jerry.Lv@...s.com>
Cc: Sebastian Reichel <sre@...nel.org>,
	"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Kernel <Kernel@...s.com>
Subject: Re: [PATCH] power: supply: bq27xxx_battery: Do not return ENODEV
 when busy

Hello, as I do not have the HW affected by this issue, I think that
you know better how to handle it. If you think that one retry is
enough for normal usage then go ahead with it. I'm fine with it.
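
For illustration, a minimal userspace sketch of what a bounded retry
around a low-level read could look like. The names read_with_retry(),
read_fn, fake_gauge and fake_read() are hypothetical and are not part
of the driver; the -EBUSY return value is the one discussed in this
thread. In a real driver a short sleep would sit between the attempts.

```c
#include <errno.h>

/* Hypothetical stand-in for a low-level register read such as
 * bq27xxx_battery_i2c_read(): returns a value >= 0 or a negative errno. */
typedef int (*read_fn)(void *ctx, unsigned char reg);

/* Retry the read up to max_retries extra times, but only while it
 * reports -EBUSY. Any other result (success or another error) is
 * returned to the caller unchanged. */
static int read_with_retry(read_fn read, void *ctx, unsigned char reg,
			   int max_retries)
{
	int attempt = 0;
	int ret;

	do {
		ret = read(ctx, reg);
		if (ret != -EBUSY)
			break;
		/* a driver would sleep a few milliseconds here */
	} while (attempt++ < max_retries);

	return ret;
}

/* Hypothetical fake gauge for trying the sketch out: it reports
 * -EBUSY for the first busy_reads reads, then returns a fixed value. */
struct fake_gauge {
	int busy_reads;
};

static int fake_read(void *ctx, unsigned char reg)
{
	struct fake_gauge *g = ctx;

	(void)reg;
	if (g->busy_reads > 0) {
		g->busy_reads--;
		return -EBUSY;
	}
	return 0x0123;
}
```

With max_retries set to 1, a gauge that is busy for a single read still
yields the value, while a gauge that stays busy for two reads in a row
propagates -EBUSY to the caller instead of turning it into -ENODEV.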

Maybe if we want to be super precise we can measure how often the
gauge is busy and then calculate the number of retries needed to keep
the device driver working in usual conditions over one or two years.
But this is overkill...
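
A rough sketch of that calculation, under a simplifying assumption that
was not measured here: every read attempt is busy independently with
the same probability p_busy. Then a read with n retries fails with
probability p_busy^(n+1), and over total_reads reads the expected
number of failures is total_reads * p_busy^(n+1). The function name
retries_needed() and all numbers in the note below are hypothetical.

```c
/* Smallest number of retries for which the expected number of failed
 * reads over total_reads reads stays at or below max_failures.
 * Assumes independent attempts, each busy with probability p_busy.
 * Returns -1 if the gauge is modelled as always busy. */
static int retries_needed(double p_busy, double total_reads,
			  double max_failures)
{
	double p_fail = p_busy;	/* failure probability with zero retries */
	int retries = 0;

	if (p_busy <= 0.0)
		return 0;
	if (p_busy >= 1.0)
		return -1;

	while (total_reads * p_fail > max_failures) {
		p_fail *= p_busy;	/* one more attempt must also be busy */
		retries++;
	}
	return retries;
}
```

As a made-up example, with one read per second for two years (about
63 million reads) and an assumed busy probability of 0.001 per attempt,
retries_needed(0.001, 63072000.0, 1.0) gives 2. Busy periods in
practice are probably correlated in time, so a short sleep between
attempts matters as much as the retry count.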

On Tuesday 24 September 2024 03:34:11 Jerry Lv wrote:
> Hi Pali,
> 
> Just as you mentioned, when the gauge is busy, the other devices
> connected to the same I2C bus will not respond either. We rarely see
> this in the normal use case, but sometimes see it in our stress test.
> 
> The gauge usually recovers from busy status very quickly, and
> too many retries may affect other devices too. So could we just retry
> once? Do you think that is enough?
> 
> Best Regards
> Jerry Lv
> 
> ________________________________________
> From: Pali Rohár <pali@...nel.org>
> Sent: Tuesday, September 24, 2024 2:16 AM
> To: Jerry Lv
> Cc: Sebastian Reichel; linux-pm@...r.kernel.org; linux-kernel@...r.kernel.org; Kernel
> Subject: Re: [PATCH] power: supply: bq27xxx_battery: Do not return ENODEV when busy
> 
> Thank you for the detailed information about the i2c NAK. In this case
> consider whether it would be better to add retry logic in the
> bq27xxx_battery_i2c_read() function.
> 
> If it is common that bq chipset itself returns i2c NAKs during normal
> operations then this affects any i2c read operation done by
> bq27xxx_battery_i2c_read() function.
> 
> So this issue is not related just to reading "flags", but to anything.
> That is why I think that retry should be handled at lower layer.
> 
> On Monday 23 September 2024 08:14:13 Jerry Lv wrote:
> > Hi Pali,
> >
> > Thanks for your excellent suggestion, I will change the code accordingly.
> >
> > About the question:
> > Anyway, which bus is BQ27Z561-R2 using (i2c?)? And how is EBUSY indicated or transferred over wire?
> > --- Yes, we connect the gauge BQ27Z561 to I2C. When it's busy, the feedback we got from the logic analyser is "NAK".
> >
> >
> > Best Regards,
> > Jerry Lv
> >
> > ________________________________________
> > From: Pali Rohár <pali@...nel.org>
> > Sent: Saturday, September 14, 2024 4:24 PM
> > To: Jerry Lv
> > Cc: Sebastian Reichel; linux-pm@...r.kernel.org; linux-kernel@...r.kernel.org; Kernel
> > Subject: Re: [PATCH] power: supply: bq27xxx_battery: Do not return ENODEV when busy
> >
> > Hello Jerry,
> >
> > I think that this issue should be handled in a different way.
> >
> > First thing is to propagate error and not change it to -ENODEV. This is
> > really confusing and makes debugging harder.
> >
> > Second thing, if bq27xxx_read() returns -EBUSY, sleep a few milliseconds
> > and call bq27xxx_read() again.
> >
> > This should cover the issue which you are observing and also fix the
> > problem which you introduced in your change (interpreting error code as
> > bogus cache data).
> >
> > Anyway, which bus is BQ27Z561-R2 using (i2c?)? And how is EBUSY
> > indicated or transferred over wire?
> >
> > Pali
> >
> > On Saturday 14 September 2024 02:57:39 Jerry Lv wrote:
> > > Hi Pali,
> > >
> > > > (Sorry for the inconvenience! The previous email was rejected by some mailing lists because of an HTML part, so I edited it and am sending it again.)
> > >
> > > Yes, bq27xxx_read() will return -EBUSY, and bq27xxx_read() will be called in many functions.
> > >
> > > > In our product, several different applications may access the gauge BQ27Z561-R2, and we often see the error code -ENODEV returned.
> > > > After debugging it with an oscilloscope and adding some debug info, we found that the device is sometimes busy, and it recovers very quickly (within a few milliseconds).
> > > > So, we want to exclude the busy case before returning -ENODEV.
> > >
> > > Best Regards,
> > > Jerry
> > >
> > > On Friday 13 September 2024 16:45:37 Jerry Lv wrote:
> > > > Multiple applications may access the device gauge at the same time, so the
> > > > gauge may be busy and EBUSY will be returned. The driver will set a flag to
> > > > record the EBUSY state, and this flag will be kept until the next periodic
> > > > update. When this flag is set, bq27xxx_battery_get_property() will just
> > > > return ENODEV until the flag is updated.
> > >
> > > > I did not find any evidence of EBUSY. Which function returns it, and
> > > > to which caller? Do you mean that bq27xxx_read() returns -EBUSY?
> > >
> > > > Even if the gauge was busy during the last accessing attempt, returning
> > > > ENODEV is not ideal, and can cause confusion in the applications layer.
> > >
> > > It would be better to either propagate correct error or return old value
> > > from cache...
> > >
> > > > > Instead, retrying access to the gauge to update the properties is expected.
> > > > The gauge typically recovers from busy state within a few milliseconds, and
> > > > the cached flag will not cause issues while updating the properties.
> > > >
> > > > Signed-off-by: Jerry Lv <Jerry.Lv@...s.com>
> > > > ---
> > > >  drivers/power/supply/bq27xxx_battery.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/power/supply/bq27xxx_battery.c b/drivers/power/supply/bq27xxx_battery.c
> > > > index 750fda543308..eefbb5029a3b 100644
> > > > --- a/drivers/power/supply/bq27xxx_battery.c
> > > > +++ b/drivers/power/supply/bq27xxx_battery.c
> > > > @@ -2029,7 +2029,7 @@ static int bq27xxx_battery_get_property(struct power_supply *psy,
> > > >                bq27xxx_battery_update_unlocked(di);
> > > >        mutex_unlock(&di->lock);
> > > >
> > > > -     if (psp != POWER_SUPPLY_PROP_PRESENT && di->cache.flags < 0)
> > > > +     if (psp != POWER_SUPPLY_PROP_PRESENT && di->cache.flags < 0 && di->cache.flags != -EBUSY)
> > > >                return -ENODEV;
> > >
> > > > ... but ignoring the error and re-using the error return value as flags in
> > > > code later in this function is a bad idea.
> > >
> > > >
> > > >        switch (psp) {
> > > >
> > > > ---
> > > > base-commit: da3ea35007d0af457a0afc87e84fddaebc4e0b63
> > > > change-id: 20240913-foo-fix2-a0d79db86a0b
> > > >
> > > > Best regards,
> > > > --
> > > > Jerry Lv <Jerry.Lv@...s.com>
> > > >
> > >
