lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <54D304DE.90506@linux.intel.com>
Date:	Thu, 05 Feb 2015 13:51:26 +0800
From:	Jiang Liu <jiang.liu@...ux.intel.com>
To:	Joerg Roedel <jroedel@...e.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Prarit Bhargava <prarit@...hat.com>,
	Yinghai Lu <yinghai@...nel.org>
CC:	x86@...nel.org, linux-kernel@...r.kernel.org, alnovak@...e.com,
	joro@...tes.org
Subject: Re: [PATCH] x86: irq: Check for valid irq descriptor in check_irq_vectors_for_cpu_disable

On 2015/2/4 21:27, Joerg Roedel wrote:
> Hi,
> 
> here is a patch to fix a kernel panic at shutdown we have seen recently.
> The issue is hard to reproduce, so the patch description about the root
> cause of the bug comes only from review and my current understanding of
> x86 irq code.
> 
> So if what I wrote in the patch description doesn't make sense, please
> let me know.
> 
> Constructive comments and feedback welcome.
> 
> Thanks,
> 
> 	Joerg
> 
> From 153e8f6cf39c42dd9767eb49a27eacfb69738fb0 Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <jroedel@...e.de>
> Date: Wed, 4 Feb 2015 13:33:33 +0100
> Subject: [PATCH] x86: irq: Check for valid irq descriptor in
>  check_irq_vectors_for_cpu_disable
> 
> When an interrupt is migrated away from a cpu it will stay
> in its vector_irq array until smp_irq_move_cleanup_interrupt
> succeeded. The cfg->move_in_progress flag is cleared already
> when the IPI was sent.
> 
> When the interrupt is destroyed after migration its 'struct
> irq_desc' is freed and the vector_irq arrays are cleaned up.
> But since cfg->move_in_progress is already 0 the references
> at cpus before the last migration will not be cleared. So
> this would leave a reference to an already destroyed irq
> alive.
> 
> When the cpu is taken down at this point, the
> check_irq_vectors_for_cpu_disable function finds a valid irq
> number in the vector_irq array, but gets NULL for its
> descriptor and dereferences it, causing a kernel panic.
> 
> This has been observed on real systems at shutdown. Add a
> check to check_irq_vectors_for_cpu_disable for a valid
> 'struct irq_desc' to prevent this issue.
> 
> Signed-off-by: Joerg Roedel <jroedel@...e.de>
Reviewed-by: Jiang Liu <jiang.liu@...ux.intel.com>

Actually there's another racing pattern.
for (irq = 0; irq < nr_irqs; irq++) {
	desc = irq_to_desc(irq);
	access desc->xxx
}

When sparsing IRQ is enabled, there's no mechanism to protect
desc returned by irq_to_desc(). Once I have considered a brute
solution of disabling freeing of irq_desc:(
Regards!
Gerry
> ---
>  arch/x86/kernel/irq.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index 705ef8d..67b1cbe 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -302,6 +302,9 @@ int check_irq_vectors_for_cpu_disable(void)
>  		irq = __this_cpu_read(vector_irq[vector]);
>  		if (irq >= 0) {
>  			desc = irq_to_desc(irq);
> +			if (!desc)
> +				continue;
> +
>  			data = irq_desc_get_irq_data(desc);
>  			cpumask_copy(&affinity_new, data->affinity);
>  			cpu_clear(this_cpu, affinity_new);
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ