lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 13 Aug 2020 11:08:54 +0100
From:   Marc Zyngier <maz@...nel.org>
To:     Jason Liu <jason.hui.liu@....com>
Cc:     Sudeep Holla <sudeep.holla@....com>, catalin.marinas@....com,
        will@...nel.org, sashal@...nel.org, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 1/1] arm64: kexec: no need to do irq_chip->irq_mask if it
 already masked

On 2020-08-13 07:03, Jason Liu wrote:
>> -----Original Message-----
>> From: Marc Zyngier <maz@...nel.org>
>> Sent: Thursday, August 6, 2020 8:26 PM
>> To: Jason Liu <jason.hui.liu@....com>
>> Cc: Sudeep Holla <sudeep.holla@....com>; catalin.marinas@....com;
>> will@...nel.org; sashal@...nel.org; linux-kernel@...r.kernel.org;
>> linux-arm-kernel@...ts.infradead.org
>> Subject: Re: [PATCH 1/1] arm64: kexec: no need to do 
>> irq_chip->irq_mask if it
>> already masked
>> 
>> On 2020-08-06 11:05, Jason Liu wrote:
>> >> -----Original Message-----
>> >> From: Marc Zyngier <maz@...nel.org>
>> 
>> [...]
>> 
>> >> > No, this patch is not papering over a much deeper issue in the driver.
>> >> > This is just to make things better for the ARM64 kexec.
>> >>
>> >> Yes, I'm sure it is... However:
>> >>
>> >> request_irq()
>> >> <goes into suspend, panic somewhere after having turned the irqchip
>> >> clock off> if (chip->irq_mask && !irqd_irq_masked(&desc->irq_data))
>> >>      <explodes, as the interrupt isn't masked>
>> >>
>> >> This is because the PM in the irqsteer driver is completely busted:
>> >> request_irq() should get a reference on the driver to prevent it from
>> >> being suspended. Since you don't implement it correctly, this doesn't
>> >> happen and your "improvement" doesn't help at all.
>> >
>> > The request_irq will get a reference to prevent the irqchip from being
>> > suspended due to it call irq_chip_pm_get(). I am pretty sure we have
>> > implemented correctly and that is also the common Linux code.
>> 
>> Then it seems you cannot read your own driver. At no point do you set 
>> the
>> parent_device that would give you a fighting chance to get the device 
>> clocked
>> and powered on. How does it work? Magic?
>> 
>> > In order to save power and let the irqchip enter into runtime SUSPEND
>> > mode, the driver will call free_irq() When it was not used(idle). Then
>> > free_irq() will decrease the reference and let the irqchip enter
>> > suspend state.
>> 
>> The reference count on *what*? There is nothing to take a reference 
>> on. So yes,
>> you understand how the core kernel works. But you don't seem to notice 
>> that
>> there is no link between the irq and the device that implements the 
>> controller.
> 
> See the code, it will call irq_chip_pm_put(&desc->irq_data)
> 
> /*
>  * Internal function to unregister an irqaction - used to free
>  * regular and special interrupts that are part of the architecture.
>  */
> static struct irqaction *__free_irq(struct irq_desc *desc, void 
> *dev_id)
> {
> ..
>         irq_chip_pm_put(&desc->irq_data);
>         module_put(desc->owner);
>         kfree(action->secondary);
>         return action;
> }

This is getting tiresome. You want to play the code-quote game?

int irq_chip_pm_put(struct irq_data *data)
{
	int retval = 0;

	if (IS_ENABLED(CONFIG_PM) && data->chip->parent_device)
		retval = pm_runtime_put(data->chip->parent_device);

	return (retval < 0) ? retval : 0;
}

What is parent_device set to in your driver? Oh wait... Nothing.
So what does the code you quoted do? Not much.

>> > So, when the irqchip entered suspend, which means there is no user for
>> > the irqchip and all the irqs were DISABLED + MASKED.
>> > Due to the runtimePM support for the irqchip, when kexec runs, it will
>> > sometimes meet the situation that the irqchip is suspend due to no
>> > users for it. So from either the performance(time cost) or coding
>> > logic, the machine_kexec_mask_interrupts() should not do double mask
>> > for the irqs which already masked.
>> 
>> I strongly suggest you start by fixing the damn driver first.
> 
> Our driver does not have issue at all. What to fix?

[I've censored myself here...]

> 
>> 
>> In the meantime, NAK to this patch.
> 
> Anyway, it seems don't really understand this issue and you just
> simply reject one reasonable fix. Sounds not good at all.

I reject it because your approach is flawed, and that you are
papering over a glaring bug in your driver that you are refusing
to fix.

Maybe the right thing to do is to remove this driver from the code
base altogether. I will prepare a patch to that effect.

         M.
-- 
Jazz is not dead. It just smells funny...

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ