lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZXspNGWszB9jAown@smile.fi.intel.com>
Date:   Thu, 14 Dec 2023 18:11:32 +0200
From:   Andy Shevchenko <andy@...nel.org>
To:     Serge Semin <fancer.lancer@...il.com>
Cc:     xiongxin <xiongxin@...inos.cn>,
        Thomas Gleixner <tglx@...utronix.de>, jikos@...nel.org,
        benjamin.tissoires@...hat.com, linux-input@...r.kernel.org,
        stable@...r.kernel.org, Riwen Lu <luriwen@...inos.cn>,
        hoan@...amperecomputing.com, linus.walleij@...aro.org,
        brgl@...ev.pl, linux-gpio@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] irq: Resolve that mask_irq/unmask_irq may not be called
 in pairs

On Thu, Dec 14, 2023 at 01:06:23PM +0300, Serge Semin wrote:
> On Thu, Dec 14, 2023 at 09:54:06AM +0800, xiongxin wrote:
> > 在 2023/12/13 22:59, Thomas Gleixner 写道:
> > > On Wed, Dec 13 2023 at 10:29, xiongxin wrote:
> > > > 在 2023/12/12 23:17, Thomas Gleixner 写道:
> > > > Sorry, the previous reply may not have clarified the BUG process. I
> > > > re-debugged and confirmed it yesterday. The current BUG execution
> > > > sequence is described as follows:
> > > 
> > > It's the sequence how this works and it works correctly.
> > > 
> > > Just because it does not work on your machine it does not mean that this
> > > is incorrect and a BUG.
> > > 
> > > You are trying to fix a symptom and thereby violating guarantees of the
> > > core code.
> > > 
> > > > That is, there is a time between the 1:handle_level_irq() and
> > > > 3:irq_thread_fn() calls for the 2:disable_irq() call to acquire the lock
> > > > and then implement the irq_state_set_disabled() operation. When finally
> > > > call irq_thread_fn()->irq_finalize_oneshot(), it cannot enter the
> > > > unmask_thread_irq() process.
> > > 
> > > Correct, because the interrupt has been DISABLED in the mean time.
> > > 
> > > > In this case, the gpio irq_chip irq_mask()/irq_unmask() callback pairs
> > > > are not called in pairs, so I think this is a BUG, but not necessarily
> > > > fixed from the irq core code layer.
> > > 
> > > No. It is _NOT_ a BUG. unmask() is not allowed to be invoked when the
> > > interrupt is DISABLED. That's the last time I'm going to tell you that.
> > > Only enable_irq() can undo the effect of disable_irq(), period.
> > > 
> > > > Next, when the gpio controller driver calls the suspend/resume process,
> > > > it is as follows:
> > > > 
> > > > suspend process:
> > > > dwapb_gpio_suspend()
> > > >       ctx->int_mask   = dwapb_read(gpio, GPIO_INTMASK);
> > > > 
> > > > resume process:
> > > > dwapb_gpio_resume()
> > > >       dwapb_write(gpio, GPIO_INTMASK, ctx->int_mask);
> > > 
> > > Did you actually look at the sequence I gave you?
> > > 
> > >     Suspend:
> > > 
> > > 	  i2c_hid_core_suspend()
> > > 	     disable_irq();       <- Marks it disabled and eventually
> > > 				     masks it.
> > > 
> > > 	  gpio_irq_suspend()
> > > 	     save_registers();    <- Saves masked interrupt
> > > 
> > >     Resume:
> > > 
> > > 	  gpio_irq_resume()
> > > 	     restore_registers(); <- Restores masked interrupt
> > > 
> > > 	  i2c_hid_core_resume()
> > > 	     enable_irq();        <- Unmasks interrupt and removes the
> > > 				     disabled marker
> > > 
> > > 
> > > Have you verified that this order of invocations is what happens on
> > > your machine?
> > > 
> > > Thanks,
> > > 
> > >          tglx
> > 
> > As described earlier, in the current situation, the irq_mask() callback of
> > gpio irq_chip is called in mask_irq(), followed by the disable_irq() in
> > i2c_hid_core_suspend(), unmask_irq() will not be executed.
> > 
> > Then call enable_irq() in i2c_hid_core_resume(). Since gpio irq_chip does
> > not implement the irq_startup() callback, it ends up calling irq_enable().
> > 
> > The irq_enable() function is then implemented as follows:
> > 
> > irq_state_clr_disabled(desc);
> > if (desc->irq_data.chip->irq_enable) {
> > 	desc->irq_data.chip->irq_enable(&desc->irq_data);
> > 	irq_state_clr_masked(desc);
> > } else {
> > 	unmask_irq(desc);
> > }
> > 
> > Because gpio irq_chip implements irq_enable(), unmask_irq() is not executed,
> > and gpio irq_chip's irq_unmask() callback is not called. Instead,
> > irq_state_clr_masked() was called to clear the masked flag.
> > 
> > The irq masked behavior is actually controlled by the
> > irq_mask()/irq_unmask() callback function pairs in gpio irq_chip. When the
> > whole situation occurs, there is one more irq_mask() operation, or one less
> > irq_unmask() operation. This ends the i2c hid resume and the gpio
> > corresponding i2c hid interrupt is also masked.
> > 
> > Please help confirm whether the current situation is a BUG, or suggest other
> > solutions to fix it.
> 
> I has just been Cc'ed to this thread from the very start of the thread
> so not sure whether I fully understand the problem. But a while ago an
> IRQ disable/undisable-mask/unmask-unpairing problem was reported for
> DW APB GPIO driver implementation, but in a another context though:
> https://lore.kernel.org/linux-gpio/1606728979-44259-1-git-send-email-luojiaxing@huawei.com/
> We didn't come to a final conclusion back then, so the patch got lost
> in the emails archive. Xiong, is it related to the problem in your
> case?

>From what explained it sounds to me that GPIO driver has missing part and
this seems the missing patch (in one or another form). Perhaps we can get
a second round of review,

-- 
With Best Regards,
Andy Shevchenko


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ