Message-ID: <dc8de6db-b2d9-4cd1-9ee0-646045199c9b@yoseli.org>
Date: Tue, 18 Feb 2025 14:17:45 +0100
From: Jean-Michel Hautbois <jeanmichel.hautbois@...eli.org>
To: Greg Ungerer <gerg@...ux-m68k.org>,
 Geert Uytterhoeven <geert@...ux-m68k.org>
Cc: linux-m68k@...ts.linux-m68k.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] m68k: coldfire: Prevent spurious interrupts when masking
 IMR

Hi Greg,

On 2/12/25 1:47 PM, Greg Ungerer wrote:
> Hi JM,
> 
> On 5/2/25 21:26, Jean-Michel Hautbois wrote:
>> Hi Geert,
>>
>> On 05/02/2025 09:14, Geert Uytterhoeven wrote:
>>> Hi Jean-Michel,
>>>
>>> On Wed, 5 Feb 2025 at 08:07, Jean-Michel Hautbois
>>> <jeanmichel.hautbois@...eli.org> wrote:
>>>> On 04/02/2025 20:27, Geert Uytterhoeven wrote:
>>>>> On Tue, 4 Feb 2025 at 19:38, Jean-Michel Hautbois
>>>>> <jeanmichel.hautbois@...eli.org> wrote:
>>>>>> The ColdFire interrupt controller can generate spurious interrupts
>>>>>> if an interrupt source is masked in the IMR while the CPU interrupt
>>>>>> priority mask (SR[I]) is set lower than the interrupt level.
>>>>>>
>>>>>> The reference manual states:
>>>>>>
>>>>>> To avoid this situation for interrupts sources with levels 1-6, first
>>>>>> write a higher level interrupt mask to the status register, before
>>>>>> setting the mask in the IMR or the module’s interrupt mask register.
>>>>>> After the mask is set, return the interrupt mask in the status
>>>>>> register to its previous value.
>>>>>>
>>>>>> It can be tested like this:
>>>>>> - Prepare an iperf3 server on the ColdFire target (iperf3 -s -D)
>>>>>> - Start a high priority cyclictest:
>>>>>>       cyclictest --secaligned -m -p 99 -i 2500 -q
>>>>>> - Start iperf3 -c $COLDFIRE_IP -t 0
>>>>>>
>>>>>> After a few seconds the dmesg may display:
>>>>>> [   84.784301] irq 24, desc: dbc502da, depth: 1, count: 0, unhandled: 0
>>>>>> [   84.784455] ->handle_irq():  0ba0aca3, handle_bad_irq+0x0/0x1e0
>>>>>> [   84.784610] ->irq_data.chip(): c6779d4f, 0x41652544
>>>>>> [   84.784719] ->action(): 00000000
>>>>>> [   84.784770] unexpected IRQ trap at vector 18
>>>>>>
>>>>>> With this patch, I never saw it in a few hours testing.
>>>>>>
>>>>>> Signed-off-by: Jean-Michel Hautbois <jeanmichel.hautbois@...eli.org>
>>>>>
>>>>> Thanks for your patch!
>>>>>
>>>>>> --- a/arch/m68k/coldfire/intc-simr.c
>>>>>> +++ b/arch/m68k/coldfire/intc-simr.c
>>>>>> @@ -58,6 +58,14 @@ static inline unsigned int irq2ebit(unsigned int irq)
>>>>>>
>>>>>>    #endif
>>>>>>
>>>>>> +static inline void intc_irq_setlevel(unsigned long level)
>>>>>> +{
>>>>>> +       asm volatile ("move.w %0,%%sr"
>>>>>> +                     : /* no outputs */
>>>>>> +                     : "d" (0x2000 | ((level) << 8))
>>>>>> +                     : "memory");
>>>>>> +}
>>>>>> +
>>>>>>    /*
>>>>>>     *     There maybe one, two or three interrupt control units, each has 64
>>>>>>     *     interrupts. If there is no second or third unit then MCFINTC1_* or
>>>>>> @@ -67,13 +75,17 @@ static inline unsigned int irq2ebit(unsigned int irq)
>>>>>>    static void intc_irq_mask(struct irq_data *d)
>>>>>>    {
>>>>>>           unsigned int irq = d->irq - MCFINT_VECBASE;
>>>>>> +       unsigned long flags = arch_local_save_flags();
>>>>>>
>>>>>> +       intc_irq_setlevel(7);
>>>>>
>>>>> Can't all of the above just be replaced by
>>>>>
>>>>>       unsigned long flags = arch_local_irq_save();
>>>>
>>>> The only difference is the Supervisor bit in SR, which is not changed
>>>> by arch_local_irq_disable() while it is forced to 1 in my function
>>>> (which sets SR to 0x2700 AFAICT).
> 
> I would expect that it will always be set here - since we must be running
> in kernel mode to be executing this code.
> 
> 
>>>> But I can confirm I couldn't see the issue with this code, while using
>>>> the existing arch_local_irq_save() it still appears (less frequently
>>>> than without it at all, but still).
>>>>
>>>> Any suggestion :-) ?
>>>
>>> There are other differences: your version clears all other bits, incl.
>>> condition codes and master/interrupt state.
> 
> Clearing of the interrupt mask seems like it might be an important
> difference here. I don't see any of the CCR bits having an effect here.
> 
> It is surprising that the existing arch_local_irq_disable() code doesn't
> satisfy the Reference Manual description of the spurious interrupt
> problem. It is exactly raising the IRQ level to 7.
> 
> 
>>> Can you save the flags above in a global, and print it in the
>>> unexpected IRQ handler, to see which other bits are set when
>>> it happens?
>>
>> An interesting side effect is... that only saving the flags makes it
>> *a lot* harder to reproduce -_-.
>> Which is consistent with a race condition though :p.
>>
>> Each time I got the message, the saved flags were 0x2711.
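
For reference, here is a small host-side sketch that decodes an SR value
like the 0x2711 quoted above and rebuilds the 0x2000 | (level << 8) value
from the original patch. The struct and function names (sr_fields,
sr_decode, sr_with_level) are made up for illustration and are not kernel
APIs. The bit positions are the usual ColdFire SR layout (S = bit 13,
M = bit 12, I = bits 10..8, CCR = bits 4..0).

```c
#include <stdio.h>

/* Hypothetical helper type: the SR fields discussed in this thread. */
struct sr_fields {
	unsigned int supervisor;	/* S, bit 13 */
	unsigned int master;		/* M, bit 12 */
	unsigned int ipl;		/* I, bits 10..8 (interrupt priority mask) */
	unsigned int ccr;		/* bits 4..0: X N Z V C */
};

/* Split a 16-bit SR value into the fields above. */
static struct sr_fields sr_decode(unsigned int sr)
{
	struct sr_fields f;

	f.supervisor = (sr >> 13) & 1;
	f.master = (sr >> 12) & 1;
	f.ipl = (sr >> 8) & 7;
	f.ccr = sr & 0x1f;
	return f;
}

/* Same arithmetic as the first patch: supervisor bit plus mask level. */
static unsigned int sr_with_level(unsigned int level)
{
	return 0x2000u | ((level & 7u) << 8);
}

/* Print one decoded SR value, e.g. sr_print(0x2711). */
static void sr_print(unsigned int sr)
{
	struct sr_fields f = sr_decode(sr);

	printf("SR=0x%04x S=%u M=%u I=%u CCR=0x%02x\n",
	       sr, f.supervisor, f.master, f.ipl, f.ccr);
}
```

Decoding 0x2711 this way gives S=1, M=0, I=7 and CCR=0x11 (X and C set),
and sr_with_level(7) gives 0x2700.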
> 
> Couple of further suggestions.
> 
> It might be worth putting an actual comment in the code about the issue.
> It will probably not be obvious in the future why this is needed here.
> Just something brief about stopping spurious interrupts should be good 
> enough.

Thanks for the modifications, the naming is indeed better, and the
comment should help future reviewers :-).

FWIW, I added the same in intc_irq_unmask() because I think it might
also occur in that path. I don't know whether it is needed or not (it is
very hard to reproduce :-/).
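
To illustrate the ordering I mean for the unmask path, here is a
user-space mock, not the kernel code itself. Everything prefixed mock_ is
a hypothetical stand-in: mock_sr models the status register, mock_cimr
models the per-controller CIMR registers, and the mock assumes all three
interrupt controllers exist (the real code checks MCFINTC1_*/MCFINTC2_*
first). It only shows that the register write happens with the level
raised and that the previous SR is restored afterwards.

```c
#include <stdint.h>

/* Hypothetical host-side stand-ins for hardware state. */
static uint16_t mock_sr = 0x2000;	/* supervisor mode, level 0 */
static uint8_t mock_cimr[3];		/* last value written per controller */
static uint16_t mock_sr_at_write;	/* SR seen at the time of the write */

/* Models intc_irq_save_and_mask(): return old SR, then set 0x2700. */
static unsigned long mock_irq_save_and_mask(void)
{
	unsigned long flags = mock_sr;

	mock_sr = 0x2700;
	return flags;
}

/* Models arch_local_irq_restore(): write the saved value back to SR. */
static void mock_irq_restore(unsigned long flags)
{
	mock_sr = (uint16_t)flags;
}

/* Models the __raw_writeb() to a CIMR and records SR at that moment. */
static void mock_write_cimr(unsigned int unit, uint8_t val)
{
	mock_cimr[unit] = val;
	mock_sr_at_write = mock_sr;
}

/* irq is already relative to the vector base, as in intc_irq_mask(). */
static void mock_intc_irq_unmask(unsigned int irq)
{
	unsigned long flags = mock_irq_save_and_mask();

	if (irq > 127)
		mock_write_cimr(2, (uint8_t)(irq - 128));
	else if (irq > 63)
		mock_write_cimr(1, (uint8_t)(irq - 64));
	else
		mock_write_cimr(0, (uint8_t)irq);

	mock_irq_restore(flags);
}
```

Calling mock_intc_irq_unmask(70) with mock_sr preset to 0x2300 writes 6
to the second controller while SR reads 0x2700, then leaves SR back at
0x2300.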

Do you need a v2 with the patch you did below?
JM

> 
> With a couple of tweaks to the code I could get tighter asm code generated.
> I dunno, maybe it is not worth it.
> 
> Regards
> Greg
> 
> 
> 
> 
> diff --git a/arch/m68k/coldfire/intc-simr.c b/arch/m68k/coldfire/intc-simr.c
> index f7c2c41b3156..11deeb6f1048 100644
> --- a/arch/m68k/coldfire/intc-simr.c
> +++ b/arch/m68k/coldfire/intc-simr.c
> @@ -58,6 +58,20 @@ static inline unsigned int irq2ebit(unsigned int irq)
> 
>   #endif
> 
> +/*
> + * Avoid spurious interrupts by raising level before modifying mask.
> + */
> +static inline unsigned long intc_irq_save_and_mask(void)
> +{
> +       unsigned long flags;
> +       asm volatile ("move.w %%sr,%0\n\t"
> +                     "move.w %1,%%sr"
> +                     : "=&d" (flags)
> +                     : "d" (0x2700)
> +                     : "memory");
> +       return flags;
> +}
> +
>   /*
>    *     There maybe one, two or three interrupt control units, each has 64
>    *     interrupts. If there is no second or third unit then MCFINTC1_* or
> @@ -66,14 +80,20 @@ static inline unsigned int irq2ebit(unsigned int irq)
> 
>   static void intc_irq_mask(struct irq_data *d)
>   {
> -       unsigned int irq = d->irq - MCFINT_VECBASE;
> +       unsigned long flags;
> +       unsigned int irq;
> 
> +       flags = intc_irq_save_and_mask();
> +
> +       irq = d->irq - MCFINT_VECBASE;
>          if (MCFINTC2_SIMR && (irq > 127))
>                  __raw_writeb(irq - 128, MCFINTC2_SIMR);
>          else if (MCFINTC1_SIMR && (irq > 63))
>                  __raw_writeb(irq - 64, MCFINTC1_SIMR);
>          else
>                  __raw_writeb(irq, MCFINTC0_SIMR);
> +
> +       arch_local_irq_restore(flags);
>   }
> 
>   static void intc_irq_unmask(struct irq_data *d)
> 
> 
> 

