lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 14 Dec 2023 09:54:06 +0800
From:   xiongxin <xiongxin@...inos.cn>
To:     Thomas Gleixner <tglx@...utronix.de>, jikos@...nel.org,
        benjamin.tissoires@...hat.com
Cc:     linux-input@...r.kernel.org, stable@...r.kernel.org,
        Riwen Lu <luriwen@...inos.cn>, hoan@...amperecomputing.com,
        fancer.lancer@...il.com, linus.walleij@...aro.org, brgl@...ev.pl,
        andy@...nel.org, linux-gpio@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] irq: Resolve that mask_irq/unmask_irq may not be called
 in pairs

在 2023/12/13 22:59, Thomas Gleixner 写道:
> On Wed, Dec 13 2023 at 10:29, xiongxin wrote:
>> 在 2023/12/12 23:17, Thomas Gleixner 写道:
>> Sorry, the previous reply may not have clarified the BUG process. I
>> re-debugged and confirmed it yesterday. The current BUG execution
>> sequence is described as follows:
> 
> It's the sequence how this works and it works correctly.
> 
> Just because it does not work on your machine it does not mean that this
> is incorrect and a BUG.
> 
> You are trying to fix a symptom and thereby violating guarantees of the
> core code.
> 
>> That is, there is a time between the 1:handle_level_irq() and
>> 3:irq_thread_fn() calls for the 2:disable_irq() call to acquire the lock
>> and then implement the irq_state_set_disabled() operation. When finally
>> call irq_thread_fn()->irq_finalize_oneshot(), it cannot enter the
>> unmask_thread_irq() process.
> 
> Correct, because the interrupt has been DISABLED in the mean time.
> 
>> In this case, the gpio irq_chip irq_mask()/irq_unmask() callback pairs
>> are not called in pairs, so I think this is a BUG, but not necessarily
>> fixed from the irq core code layer.
> 
> No. It is _NOT_ a BUG. unmask() is not allowed to be invoked when the
> interrupt is DISABLED. That's the last time I'm going to tell you that.
> Only enable_irq() can undo the effect of disable_irq(), period.
> 
>> Next, when the gpio controller driver calls the suspend/resume process,
>> it is as follows:
>>
>> suspend process:
>> dwapb_gpio_suspend()
>>       ctx->int_mask   = dwapb_read(gpio, GPIO_INTMASK);
>>
>> resume process:
>> dwapb_gpio_resume()
>>       dwapb_write(gpio, GPIO_INTMASK, ctx->int_mask);
> 
> Did you actually look at the sequence I gave you?
> 
>     Suspend:
> 
> 	  i2c_hid_core_suspend()
> 	     disable_irq();       <- Marks it disabled and eventually
> 				     masks it.
> 
> 	  gpio_irq_suspend()
> 	     save_registers();    <- Saves masked interrupt
> 
>     Resume:
> 
> 	  gpio_irq_resume()
> 	     restore_registers(); <- Restores masked interrupt
> 
> 	  i2c_hid_core_resume()
> 	     enable_irq();        <- Unmasks interrupt and removes the
> 				     disabled marker
> 
> 
> Have you verified that this order of invocations is what happens on
> your machine?
> 
> Thanks,
> 
>          tglx

As described earlier, in the current situation, the irq_mask() callback 
of gpio irq_chip is called in mask_irq(), followed by the disable_irq() 
in i2c_hid_core_suspend(), unmask_irq() will not be executed.

Then call enable_irq() in i2c_hid_core_resume(). Since gpio irq_chip 
does not implement the irq_startup() callback, it ends up calling 
irq_enable().

The irq_enable() function is then implemented as follows:

irq_state_clr_disabled(desc);
if (desc->irq_data.chip->irq_enable) {
	desc->irq_data.chip->irq_enable(&desc->irq_data);
	irq_state_clr_masked(desc);
} else {
	unmask_irq(desc);
}

Because gpio irq_chip implements irq_enable(), unmask_irq() is not 
executed, and gpio irq_chip's irq_unmask() callback is not called. 
Instead, irq_state_clr_masked() was called to clear the masked flag.

The irq masked behavior is actually controlled by the 
irq_mask()/irq_unmask() callback function pairs in gpio irq_chip. When 
the whole situation occurs, there is one more irq_mask() operation, or 
one less irq_unmask() operation. This ends the i2c hid resume and the 
gpio corresponding i2c hid interrupt is also masked.

Please help confirm whether the current situation is a BUG, or suggest 
other solutions to fix it.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ