[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <569C9B8C.9050406@gmail.com>
Date: Mon, 18 Jan 2016 16:00:12 +0800
From: zhuyj <zyjzyj2000@...il.com>
To: Huang Shijie <shijie.huang@....com>,
Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>,
Jiang Liu <jiang.liu@...ux.intel.com>,
Peter Zijlstra <peterz@...radead.org>, nd@....com
Subject: Re: [PATCH 1/1] Revert "genirq: Remove the second parameter from
handle_irq_event_percpu()"
Hi, all
I made tests for this patch. To now, I can not find any similar problem.
Best Regards!
Zhu Yanjun
On 01/14/2016 09:29 AM, Huang Shijie wrote:
> On Wed, Jan 13, 2016 at 02:07:25PM +0100, Thomas Gleixner wrote:
>> On Wed, 13 Jan 2016, zyjzyj2000@...il.com wrote:
>>
>>> After this commit 71f64340fc0e ("genirq: Remove the second parameter
>>> from handle_irq_event_percpu()") is applied, the variable action is
>>> not protected by raw_spin_lock. The following calltrace will pop up.
>> Thanks, for the report. I missed that detail when merging the patch!
>>
>> Just for correctness sake: You miss to explain why this can happen.
>>
>> It's not about the variable action, it's about desc->action not being
>> protected anymore. So the reason why this oopses is that the action is being
>> removed concurrently.
>>
>> CPU 0 CPU 1
>>
>> free_irq() lock(desc)
>> lock(desc) handle_edge_irq()
>> handle_irq_event(desc)
>> unlock(desc)
>> desc->action = NULL handle_irq_event_percpu(desc)
>> action = desc->action
>>
>> While the original code did:
>>
>> free_irq() lock(desc)
>> lock(desc) handle_edge_irq()
>> handle_irq_event()
>> action = desc->action
>> unlock(desc)
>> desc->action = NULL handle_irq_event_percpu(desc, action)
>>
>> So now the question is whether we revert that patch or simply change
>> handle_irq_event_percpu() to deal with that. Patch below.
>>
>> That preserves us the code size reduction of commit 71f64340fc0e. This is safe
>> because we either see a valid desc->action or NULL. If the action is about to
>> be removed it is still valid as free_irq() is blocked on synchronize_irq().
>>
>> free_irq() lock(desc)
>> lock(desc) handle_edge_irq()
>> handle_irq_event(desc)
>> set(INPROGRESS)
>> unlock(desc)
>> handle_irq_event_percpu(desc)
>> action = desc->action
>> desc->action = NULL
>> sychronize_irq()
>> while(INPROGRESS); lock(desc)
>> clr(INPROGRESS)
>> free(action)
>>
>> That's basically the same mechanism as we have for shared
>> interrupts. action->next can become NULL while handle_irq_event_percpu()
>> runs. Either it sees the action or NULL. It does not matter, because action
>> itself cannot go away.
>>
>> Thanks,
>>
>> tglx
>>
>> 8<-------------
>>
>> --- a/kernel/irq/handle.c
>> +++ b/kernel/irq/handle.c
>> @@ -136,9 +136,15 @@ irqreturn_t handle_irq_event_percpu(stru
>> {
>> irqreturn_t retval = IRQ_NONE;
>> unsigned int flags = 0, irq = desc->irq_data.irq;
>> - struct irqaction *action = desc->action;
>> + struct irqaction *action;
>>
>> - do {
>> + /*
>> + * READ_ONCE is not required here. The compiler cannot reload action
>> + * because it'll be action->next for the second iteration of the loop.
>> + */
>> + action = desc->action;
>> +
>> + while (action) {
>> irqreturn_t res;
>>
>> trace_irq_handler_entry(irq, action);
>> @@ -173,7 +179,7 @@ irqreturn_t handle_irq_event_percpu(stru
>>
>> retval |= res;
>> action = action->next;
>> - } while (action);
>> + }
>>
>> add_interrupt_randomness(irq, flags);
> I prefer to this patch, revert the old the patch is not a good solution.
>
> thanks
> Huang Shijie
>
Powered by blists - more mailing lists