[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <VI1PR0402MB3824B4D3BF37335FE48A12F1AE430@VI1PR0402MB3824.eurprd04.prod.outlook.com>
Date: Thu, 13 Aug 2020 06:03:13 +0000
From: Jason Liu <jason.hui.liu@....com>
To: Marc Zyngier <maz@...nel.org>
CC: Sudeep Holla <sudeep.holla@....com>,
"catalin.marinas@....com" <catalin.marinas@....com>,
"will@...nel.org" <will@...nel.org>,
"sashal@...nel.org" <sashal@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>
Subject: RE: [PATCH 1/1] arm64: kexec: no need to do irq_chip->irq_mask if it
already masked
> -----Original Message-----
> From: Marc Zyngier <maz@...nel.org>
> Sent: Thursday, August 6, 2020 8:26 PM
> To: Jason Liu <jason.hui.liu@....com>
> Cc: Sudeep Holla <sudeep.holla@....com>; catalin.marinas@....com;
> will@...nel.org; sashal@...nel.org; linux-kernel@...r.kernel.org;
> linux-arm-kernel@...ts.infradead.org
> Subject: Re: [PATCH 1/1] arm64: kexec: no need to do irq_chip->irq_mask if it
> already masked
>
> On 2020-08-06 11:05, Jason Liu wrote:
> >> -----Original Message-----
> >> From: Marc Zyngier <maz@...nel.org>
>
> [...]
>
> >> > No, this patch is not papering over a much deeper issue in the driver.
> >> > This is just to make things better for the ARM64 kexec.
> >>
> >> Yes, I'm sure it is... However:
> >>
> >> request_irq()
> >> <goes into suspend, panic somewhere after having turned the irqchip
> >> clock off> if (chip->irq_mask && !irqd_irq_masked(&desc->irq_data))
> >> <explodes, as the interrupt isn't masked>
> >>
> >> This is because the PM in the irqsteer driver is completely busted:
> >> request_irq() should get a reference on the driver to prevent it from
> >> being suspended. Since you don't implement it correctly, this doesn't
> >> happen and your "improvement" doesn't help at all.
> >
> > The request_irq will get a reference to prevent the irqchip from being
> > suspended due to it call irq_chip_pm_get(). I am pretty sure we have
> > implemented correctly and that is also the common Linux code.
>
> Then it seems you cannot read your own driver. At no point do you set the
> parent_device that would give you a fighting chance to get the device clocked
> and powered on. How does it work? Magic?
>
> > In order to save power and let the irqchip enter into runtime SUSPEND
> > mode, the driver will call free_irq() When it was not used(idle). Then
> > free_irq() will decrease the reference and let the irqchip enter
> > suspend state.
>
> The reference count on *what*? There is nothing to take a reference on. So yes,
> you understand how the core kernel works. But you don't seem to notice that
> there is no link between the irq and the device that implements the controller.
See the code, it will call irq_chip_pm_put(&desc->irq_data)
/*
* Internal function to unregister an irqaction - used to free
* regular and special interrupts that are part of the architecture.
*/
static struct irqaction *__free_irq(struct irq_desc *desc, void *dev_id)
{
..
irq_chip_pm_put(&desc->irq_data);
module_put(desc->owner);
kfree(action->secondary);
return action;
}
>
> > So, when the irqchip entered suspend, which means there is no user for
> > the irqchip and all the irqs were DISABLED + MASKED.
> > Due to the runtimePM support for the irqchip, when kexec runs, it will
> > sometimes meet the situation that the irqchip is suspend due to no
> > users for it. So from either the performance(time cost) or coding
> > logic, the machine_kexec_mask_interrupts() should not do double mask
> > for the irqs which already masked.
>
> I strongly suggest you start by fixing the damn driver first.
Our driver does not have issue at all. What to fix?
>
> In the meantime, NAK to this patch.
Anyway, it seems don't really understand this issue and you just simply reject one reasonable fix. Sounds not good at all.
>
> M.
> --
> Jazz is not dead. It just smells funny...
Powered by blists - more mailing lists