Date:	Thu, 14 Jan 2016 09:29:19 +0800
From:	Huang Shijie <shijie.huang@....com>
To:	Thomas Gleixner <tglx@...utronix.de>
CC:	<zyjzyj2000@...il.com>, LKML <linux-kernel@...r.kernel.org>,
	Jiang Liu <jiang.liu@...ux.intel.com>,
	Peter Zijlstra <peterz@...radead.org>, <nd@....com>
Subject: Re: [PATCH 1/1] Revert "genirq: Remove the second parameter from
 handle_irq_event_percpu()"

On Wed, Jan 13, 2016 at 02:07:25PM +0100, Thomas Gleixner wrote:
> On Wed, 13 Jan 2016, zyjzyj2000@...il.com wrote:
> 
> > After commit 71f64340fc0e ("genirq: Remove the second parameter
> > from handle_irq_event_percpu()") was applied, the variable action is
> > no longer protected by raw_spin_lock. The following calltrace pops up.
> 
> Thanks for the report. I missed that detail when merging the patch!
> 
> Just for correctness' sake: you did not explain why this can happen.
> 
> It's not about the variable action, it's about desc->action not being
> protected anymore. So the reason why this oopses is that the action is being
> removed concurrently.
> 
> CPU 0			CPU 1
> 
> free_irq()		lock(desc)
> lock(desc)		handle_edge_irq()
> 			  handle_irq_event(desc)
> 			    unlock(desc)
> desc->action = NULL	    handle_irq_event_percpu(desc)
> 	       		      action = desc->action
> 
> While the original code did:
> 
> free_irq()		lock(desc)
> lock(desc)		handle_edge_irq()
> 			  handle_irq_event()
> 	       		    action = desc->action
> 			    unlock(desc)
> desc->action = NULL	    handle_irq_event_percpu(desc, action)
> 	       		    
> So now the question is whether we revert that patch or simply change
> handle_irq_event_percpu() to deal with that. Patch below.
> 
> That preserves the code size reduction of commit 71f64340fc0e. This is safe
> because we either see a valid desc->action or NULL. If the action is about to
> be removed it is still valid as free_irq() is blocked on synchronize_irq().
> 
> free_irq()		lock(desc)
> lock(desc)		handle_edge_irq()
> 			  handle_irq_event(desc)
> 			    set(INPROGRESS)
> 			    unlock(desc)
> 			      handle_irq_event_percpu(desc)
> 	       		        action = desc->action
> desc->action = NULL
> synchronize_irq()
>   while(INPROGRESS);	   lock(desc)
> 			   clr(INPROGRESS)
> free(action)
> 
> That's basically the same mechanism as we have for shared
> interrupts. action->next can become NULL while handle_irq_event_percpu()
> runs. Either it sees the action or NULL. It does not matter, because action
> itself cannot go away.
> 
> Thanks,
> 
> 	tglx
> 
> 8<-------------
> 
> --- a/kernel/irq/handle.c
> +++ b/kernel/irq/handle.c
> @@ -136,9 +136,15 @@ irqreturn_t handle_irq_event_percpu(stru
>  {
>  	irqreturn_t retval = IRQ_NONE;
>  	unsigned int flags = 0, irq = desc->irq_data.irq;
> -	struct irqaction *action = desc->action;
> +	struct irqaction *action;
>  
> -	do {
> +	/*
> +	 * READ_ONCE is not required here. The compiler cannot reload action
> +	 * because it'll be action->next for the second iteration of the loop.
> +	 */
> +	action = desc->action;
> +
> +	while (action) {
>  		irqreturn_t res;
>  
>  		trace_irq_handler_entry(irq, action);
> @@ -173,7 +179,7 @@ irqreturn_t handle_irq_event_percpu(stru
>  
>  		retval |= res;
>  		action = action->next;
> -	} while (action);
> +	}
>  
>  	add_interrupt_randomness(irq, flags);

I prefer this patch; reverting the old patch is not a good solution.
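
To make the loop change in the quoted patch concrete, here is a minimal
user-space sketch. It is not kernel code: struct action, struct desc,
run_handlers() and count_call() are hypothetical stand-ins for
struct irqaction, struct irq_desc and handle_irq_event_percpu(). It only
shows that a while loop tolerates a NULL head pointer, where a do/while
loop would dereference it on the first pass.

```c
#include <stddef.h>

/* Hypothetical user-space stand-ins for irqaction and irq_desc. */
struct action {
	int (*handler)(void *data);
	void *data;
	struct action *next;
};

struct desc {
	struct action *action;
};

/*
 * Same shape as the patched loop: the pointer is tested before the first
 * dereference, so a concurrently cleared desc->action is harmless.
 */
static int run_handlers(struct desc *d)
{
	int handled = 0;
	struct action *a = d->action;	/* may already be NULL */

	while (a) {
		handled |= a->handler(a->data);
		a = a->next;
	}
	return handled;
}

/* Example handler: counts invocations through its data pointer. */
static int count_call(void *data)
{
	++*(int *)data;
	return 1;
}
```

With two chained actions both handlers run; with desc.action set to NULL
the function returns 0 without touching any action.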

thanks
Huang Shijie
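
The INPROGRESS argument from the quoted mail can also be walked through
step by step. The sketch below is a single-threaded model under stated
assumptions, not the kernel implementation: all names (handler_enter(),
handler_snapshot(), handler_exit(), free_detach(), free_may_release())
are made up for illustration, locking is left out, and the
synchronize_irq() wait is reduced to a predicate that the caller would
poll.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the fields needed for the walk-through. */
struct action {
	int id;
};

struct desc {
	struct action *action;
	bool inprogress;	/* models the INPROGRESS state */
};

/* Handler side, descriptor lock held: mark the interrupt as in progress. */
static void handler_enter(struct desc *d)
{
	d->inprogress = true;
}

/* Handler side, lock dropped: snapshot the action. NULL is a valid result. */
static struct action *handler_snapshot(struct desc *d)
{
	return d->action;
}

/* Handler side, lock retaken: the handlers have finished. */
static void handler_exit(struct desc *d)
{
	d->inprogress = false;
}

/* free_irq() side, lock held: unlink the action and hand it back. */
static struct action *free_detach(struct desc *d)
{
	struct action *old = d->action;

	d->action = NULL;
	return old;
}

/* free_irq() side: the condition the synchronize_irq() step waits for. */
static bool free_may_release(const struct desc *d)
{
	return !d->inprogress;
}
```

Calling handler_enter(), handler_snapshot(), free_detach() in that order
leaves the handler with a valid pointer while free_may_release() stays
false until handler_exit() runs. Swapping the snapshot and the detach
makes the handler see NULL instead. In both orders the action is never
released while a handler could still use it.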
