[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <mhng-cd9af4c1-3eb8-4053-8d59-e59c12c14166@palmer-ri-x1c9a>
Date: Tue, 25 Jun 2024 08:36:15 -0700 (PDT)
From: Palmer Dabbelt <palmer@...belt.com>
To: takakura@...inux.co.jp
CC: alex@...ti.fr, songshuaishuai@...ylab.org,
Paul Walmsley <paul.walmsley@...ive.com>, aou@...s.berkeley.edu, guoren@...nel.org, xianting.tian@...ux.alibaba.com,
takahiro.akashi@...aro.org, takakura@...inux.co.jp, linux-riscv@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [-fixes] riscv: kexec: Avoid deadlock in kexec crash path
On Sun, 09 Jun 2024 07:18:02 PDT (-0700), takakura@...inux.co.jp wrote:
> Hi Alex, Song,
>
> On Fri, 24 May 2024, Alexandre Ghiti wrote:
>>Hi Song, Ryo,
>>
>>On 06/05/2024 07:10, takakura@...inux.co.jp wrote:
>>> Hi Song and Paul!
>>>
>>>>> To avoid the deadlock, this patch directly EOI the irq regardless of
>>>>> the active status of irqchip.
>>>> Taking a quick look at the other architectures, looks like no one else is
>>>> doing this. Is this addressing a RISC-V-only problem?
>>>>
>>>>> diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c
>>>>> index f6c7135b00d7..d7ddf4d2b243 100644
>>>>> --- a/arch/riscv/kernel/machine_kexec.c
>>>>> +++ b/arch/riscv/kernel/machine_kexec.c
>>>>> @@ -149,20 +149,12 @@ static void machine_kexec_mask_interrupts(void)
>>>>>
>>>>> for_each_irq_desc(i, desc) {
>>>>> struct irq_chip *chip;
>>>>> - int ret;
>>>>>
>>>>> chip = irq_desc_get_chip(desc);
>>>>> if (!chip)
>>>>> continue;
>>>>>
>>>>> - /*
>>>>> - * First try to remove the active state. If this
>>>>> - * fails, try to EOI the interrupt.
>>>>> - */
>>>>> - ret = irq_set_irqchip_state(i, IRQCHIP_STATE_ACTIVE, false);
>>>>> -
>>>>> - if (ret && irqd_irq_inprogress(&desc->irq_data) &&
>>>>> - chip->irq_eoi)
>>>>> + if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
>>>>> chip->irq_eoi(&desc->irq_data);
>>> I think this deadlock is relevant to riscv and arm64 as they both
>>> acquire irqdesc spinlock by calling irq_set_irqchip_state() during their
>>> machine_kexec_mask_interrupts().
>>>
>>> However, I think calling irq_set_irqchip_state() during
>>> machine_kexec_mask_interrupts() is arm64 specific way of handling EOI
>>> which is not necessary for riscv.
>>> For arm64, its interrupt controller(gic) seems to have two ways of EOIing
>>> an interrupt depending on the mode which gic is configured. One of them
>>> treats EOI as two step procedure, priority drop and deactivation. I think
>>> irq_set_irqchip_state() is there to handle the deactivation part of
>>> the procedure.
>>> For riscv, EOI only requires irq_eoi handler to complete EOI and I think
>>> keeping irq_set_irqchip_state() will only leave this possible deadlock
>>> without any use.
>>> So I think it's best we simply remove irq_set_irqchip_state() as Song did.
>>
>>
>>I think this ^ is relevant and should be added to the commit log. @Song
>>can you respin another version with the updated commit log? @Ryo can you
>>add your Reviewed-by when it's done?
>
> Sure!
Just checking up on this one, I don't see a v2 on the lists.
>
>>This fix has been lagging behind for quite some time, it would be nice
>>to merge this in 6.10 and backport to stable.
>
> Sincerely,
>
> Ryo Takakura
>
>>Thanks,
>>
>>Alex
Powered by blists - more mailing lists