Message-ID: <6affbb35-7b79-db6e-a346-e74d2ba2e886@canonical.com>
Date:   Tue, 14 Sep 2021 09:21:15 +0200
From:   Krzysztof Kozlowski <krzysztof.kozlowski@...onical.com>
To:     Sebastian Krzyszkowiak <sebastian.krzyszkowiak@...i.sm>,
        Sebastian Reichel <sre@...nel.org>, linux-pm@...r.kernel.org
Cc:     linux-kernel@...r.kernel.org,
        Anton Vorontsov <anton.vorontsov@...aro.org>,
        Ramakrishna Pallala <ramakrishna.pallala@...el.com>,
        Dirk Brandewie <dirk.brandewie@...il.com>,
        stable@...r.kernel.org, kernel@...i.sm
Subject: Re: [PATCH 1/2] power: supply: max17042_battery: Clear status bits in
 interrupt handler

On 13/09/2021 20:32, Sebastian Krzyszkowiak wrote:
> On Monday, 13 September 2021 15:02:34 CEST Krzysztof Kozlowski wrote:
>> On 12/09/2021 22:54, Sebastian Krzyszkowiak wrote:
>>> The gauge requires us to clear the status bits manually for some alerts
>>> to be properly dismissed. Previously the IRQ was configured to react only
>>> on falling edge, which wasn't technically correct (the ALRT line is active
>>> low), but it had a happy side-effect of preventing interrupt storms
>>> on uncleared alerts from happening.
>>>
>>> Fixes: 7fbf6b731bca ("power: supply: max17042: Do not enforce (incorrect) interrupt trigger type")
>>> Cc: <stable@...r.kernel.org>
>>> Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@...i.sm>
>>> ---
>>>
>>>  drivers/power/supply/max17042_battery.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/power/supply/max17042_battery.c b/drivers/power/supply/max17042_battery.c
>>> index 8dffae76b6a3..c53980c8432a 100644
>>> --- a/drivers/power/supply/max17042_battery.c
>>> +++ b/drivers/power/supply/max17042_battery.c
>>> @@ -876,6 +876,9 @@ static irqreturn_t max17042_thread_handler(int id, void *dev)
>>>  		max17042_set_soc_threshold(chip, 1);
>>>  	
>>>  	}
>>>
>>> +	regmap_clear_bits(chip->regmap, MAX17042_STATUS,
>>> +			  0xFFFF & ~(STATUS_POR_BIT | STATUS_BST_BIT));
>>> +
>>
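
As a reading aid for the hunk above, here is a minimal user-space sketch of what the mask expression evaluates to and what clearing those bits does to a register value. It is not driver code. The bit positions are assumed to match the driver's Status register defines (POR at bit 1, Bst at bit 3), and clear_bits() is a hypothetical stand-in that operates on a plain variable rather than on a regmap.

```c
#include <stdint.h>

/* Assumed bit positions, mirroring the driver's Status register defines. */
#define STATUS_POR_BIT (1u << 1)
#define STATUS_BST_BIT (1u << 3)

/* Same expression as in the patch: every bit except POR and Bst. */
#define STATUS_CLEAR_MASK (0xFFFFu & ~(STATUS_POR_BIT | STATUS_BST_BIT))

/*
 * Hypothetical stand-in for a clear-bits operation on a plain variable:
 * zero the bits selected by `bits` and keep all the others.
 */
static uint16_t clear_bits(uint16_t reg, uint16_t bits)
{
	return (uint16_t)(reg & (uint16_t)~bits);
}
```

Under these assumptions the mask works out to 0xFFF5, so a Status value with every bit set is reduced to just POR and Bst (0x000A).
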
>> Are you sure that this was the cause of the interrupt storm, and not an
>> incorrect SoC value (read from the register for ModelGauge m3 without the
>> fuel gauge model being configured)?
> 
> Yes, I am sure. I have observed this on a fully configured max17055 with 
> ModelGauge m5. It also makes sense to me based on what I read in the code and 
> datasheets.
> 
> There were two kinds of storms - the short ones happening on each SOC change 
> caused by SOC threshold alerts set by max17042_set_soc_threshold which 
> eventually got cleared by reconfiguring the thresholds; and a huge one 
> happening when SOC got down to 0% that did not go away until the battery got 
> charged to at least 1% at which point the thresholds got reconfigured again 
> (which is how I noticed the underflow fixed by the second patch).
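
The storm described above can be illustrated with a toy model. This is only a sketch under a stated assumption: the ALRT line is modelled as staying asserted for as long as any latched Status bit is set, and a level-triggered interrupt re-enters the handler while the line is asserted. The names toy_gauge, toy_handler and count_irqs are hypothetical and exist only for this illustration. The model ignores the other way out mentioned above (reconfiguring the thresholds) and does not model the earlier falling-edge setup.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_SMN_BIT (1u << 10) /* assumed position of the SOC-min alert bit */

/* Toy gauge: ALRT is modelled as asserted while any latched bit is set. */
struct toy_gauge {
	uint16_t status;
};

static bool alrt_asserted(const struct toy_gauge *g)
{
	return g->status != 0;
}

/* Toy handler: optionally clears the latched bits, as the patch does. */
static void toy_handler(struct toy_gauge *g, bool clear_status)
{
	if (clear_status)
		g->status = 0;
}

/*
 * Count handler invocations for a single latched alert on a
 * level-triggered line. The handler is re-entered for as long as the
 * line stays asserted; `limit` caps the loop so the storm case ends.
 */
static unsigned int count_irqs(bool clear_status, unsigned int limit)
{
	struct toy_gauge g = { .status = STATUS_SMN_BIT };
	unsigned int n = 0;

	while (alrt_asserted(&g) && n < limit) {
		toy_handler(&g, clear_status);
		n++;
	}
	return n;
}
```

In this model a handler that clears the status runs once per alert, while a handler that leaves the bit latched runs until the cap is reached.
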

OK, understood.

> 
> Besides, I also have patches for configuring m5 gauge via DT that I'll send 
> once I clean them up.

That's cool! Happy to see such work.

> 
>> You should only clear the bits you were woken up for... Keep in mind that
>> in a DT configuration the fuel gauge is most likely broken by missing
>> configuration. With alerts enabled, several other config fields should be
>> cleared.
> 
> I have checked all the bits in the Status register and aside from Bst, POR and 
> a bunch of "don't-care" bits they're all alert indicators that we either handle 
> explicitly in the interrupt handler (Smn/Smx) or implicitly via 
> power_supply_changed (Imn/Imx, Vmn/Vmx, Tmn/Tmx, dSOCi, Bi/Br). The driver 
> unconditionally enables alerts for SOC thresholds and all the rest stays 
> effectively disabled at POR; however, a bootloader or firmware may configure it 
> differently, which may be wanted for things like resuming from suspend when a 
> bad condition happens. Therefore we need to clear all the bits anyway and I'm 
> not sure whether iterating through them in an "if set then clear" loop gains us 
> anything aside from additional lines of code.
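
To make the comparison above concrete, here is a small sketch that puts an "if set then clear" loop next to a single mask and checks that they give the same result for every 16-bit value. It works on a plain variable, not on the gauge. The bit positions of POR and Bst are assumed to match the driver, and all function names are hypothetical.

```c
#include <stdint.h>

/* Assumed bit positions, mirroring the driver's Status register defines. */
#define STATUS_POR_BIT (1u << 1)
#define STATUS_BST_BIT (1u << 3)
#define STATUS_KEEP_MASK (STATUS_POR_BIT | STATUS_BST_BIT)

/* Variant 1: walk over all 16 bits and clear each other bit that is set. */
static uint16_t clear_by_loop(uint16_t status)
{
	for (unsigned int bit = 0; bit < 16; bit++) {
		uint16_t m = (uint16_t)(1u << bit);

		if (m & STATUS_KEEP_MASK)
			continue;	/* leave POR and Bst alone */
		if (status & m)
			status = (uint16_t)(status & ~m);
	}
	return status;
}

/* Variant 2: one mask, equivalent to clearing 0xFFFF & ~(POR | Bst). */
static uint16_t clear_by_mask(uint16_t status)
{
	return (uint16_t)(status & STATUS_KEEP_MASK);
}

/* Returns 1 if both variants agree for every possible 16-bit value. */
static int variants_agree(void)
{
	for (uint32_t v = 0; v <= 0xFFFFu; v++) {
		if (clear_by_loop((uint16_t)v) != clear_by_mask((uint16_t)v))
			return 0;
	}
	return 1;
}
```

In this model the two variants are interchangeable. The sketch says nothing about bus traffic or about the hardware setting new bits between reads, which only a test on a real gauge could show.
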

Seems reasonable, you're right. Could you mention this explanation in
the commit message or in a comment in the code?


Best regards,
Krzysztof
