lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 27 Feb 2020 16:11:07 -0800
From:   Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        Keith Busch <keith.busch@...el.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>, linux-pci@...r.kernel.org
Subject: Re: [PATCH v1 1/1] x86/apic/vector: Fix NULL pointer exception in
 irq_complete_move()


On 2/27/20 11:59 AM, Thomas Gleixner wrote:
> sathyanarayanan.kuppuswamy@...ux.intel.com writes:
>> If an IRQ is scheduled using generic_handle_irq() function in a non IRQ
>> path, the irq_regs per CPU variable will not be set. Hence calling
>> irq_complete_move() function in this scenario leads to NULL pointer
>> de-reference exception. One example for this issue is, triggering fake
>> AER errors using PCIe aer_inject framework. So add addition check for
> What?
>
> This is completely broken to begin with. You are fixing the wrong
> end. The broken commit is:
>
> 390e2db82480 ("PCI/AER: Abstract AER interrupt handling")
>
> I have to admit that it was already broken before that commit because
> calling just the interrupt handler w/o serialization is as wrong as it
> gets, but then calling a random function just because it's accessible
> and does not explode in the face is not much better.
>
>> [   58.368269]  handle_edge_irq+0x7d/0x1e0
>> [   58.368272]  generic_handle_irq+0x27/0x30
>> [   58.368278]  aer_inject_write+0x53a/0x720
>> [   58.368283]  __vfs_write+0x36/0x1b0
>> [   58.368289]  ? common_file_perm+0x47/0x130
>> [   58.368293]  ? security_file_permission+0x2e/0xf0
>> [   58.368295]  vfs_write+0xa5/0x180
>> [   58.368296]  ksys_write+0x52/0xc0
>> [   58.368300]  do_syscall_64+0x48/0x120
>> [   58.368307]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> Calling generic_handle_irq() through a sysfs write is in the worst case
> going to corrupt state and that NULL pointer dereference is just one
> particular effect which made this bogosity visible.
>
> Even if you "fixed" this particular case, invoking this when an
> interrupt affinity change is scheduled will also wreckage state. In the
> best case it will only trigger the already existing WARN_ON() in the MSI
> code when the interrupt in question is MSI and the invocation happens on
> the wrong CPU. But there are worse things which can happen.
>
> We are neither going to paper over it by just silently preventing this
> particular NULL pointer dereference nor are we going to sprinkle more
> checks all over the place just to deal with this. The interrupt delivery
> hardware trainwreck of x86 CPUs is fragile as hell and we have enough
> horrible code already to deal with that. No need for self inflicted
> horrors.
>
> The proper fix for this is below as it prevents the abuse of this
> interface.
>
> This will not break the AER error injection as it has been broken
> forever. It just makes sure that the brokeness is not propagating
> through the core code.
>
> The right thing to make AER injection work is to inject the interrupt
> via the retrigger mechanism, which will send an IPI. There is no core
> interface for this, but that's a solvable problem.
Thanks for the review. I better take the IPI approach to solve the problem.

Along with this patch, may be adding a comment to generic_handle_irq()
about who the expected callers should also prevent people from using
it in wrong way.
>
> Thanks,
>
>          tglx
>
> 8<-----------------
> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
> index 2c5676b0a6e7..d7c4a3b815a6 100644
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -556,6 +556,7 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
>   		irqd->chip_data = apicd;
>   		irqd->hwirq = virq + i;
>   		irqd_set_single_target(irqd);
> +		irqd_set_handle_enforce_irqctx(irqd);
>   		/*
>   		 * Legacy vectors are already assigned when the IOAPIC
>   		 * takes them over. They stay on the same vector. This is
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index 3ed5a055b5f4..9315fbb87db3 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -211,6 +211,8 @@ struct irq_data {
>    * IRQD_CAN_RESERVE		- Can use reservation mode
>    * IRQD_MSI_NOMASK_QUIRK	- Non-maskable MSI quirk for affinity change
>    *				  required
> + * IRQD_HANDLE_ENFORCE_IRQCTX	- Enforce that handle_irq_*() is only invoked
> + *				  from actual interrupt context.
>    */
>   enum {
>   	IRQD_TRIGGER_MASK		= 0xf,
> @@ -234,6 +236,7 @@ enum {
>   	IRQD_DEFAULT_TRIGGER_SET	= (1 << 25),
>   	IRQD_CAN_RESERVE		= (1 << 26),
>   	IRQD_MSI_NOMASK_QUIRK		= (1 << 27),
> +	IRQD_HANDLE_ENFORCE_IRQCTX	= (1 << 28),
>   };
>   
>   #define __irqd_to_state(d) ACCESS_PRIVATE((d)->common, state_use_accessors)
> @@ -303,6 +306,16 @@ static inline bool irqd_is_single_target(struct irq_data *d)
>   	return __irqd_to_state(d) & IRQD_SINGLE_TARGET;
>   }
>   
> +static inline void irqd_set_handle_enforce_irqctx(struct irq_data *d)
> +{
> +	__irqd_to_state(d) |= IRQD_HANDLE_ENFORCE_IRQCTX;
> +}
> +
> +static inline bool irqd_is_handle_enforce_irqctx(struct irq_data *d)
> +{
> +	return __irqd_to_state(d) & IRQD_HANDLE_ENFORCE_IRQCTX;
> +}
> +
>   static inline bool irqd_is_wakeup_set(struct irq_data *d)
>   {
>   	return __irqd_to_state(d) & IRQD_WAKEUP_STATE;
> diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
> index 3924fbe829d4..4561f971bc74 100644
> --- a/kernel/irq/internals.h
> +++ b/kernel/irq/internals.h
> @@ -427,6 +427,10 @@ static inline struct cpumask *irq_desc_get_pending_mask(struct irq_desc *desc)
>   {
>   	return desc->pending_mask;
>   }
> +static inline bool handle_enforce_irqctx(struct irq_data *data)
> +{
> +	return irqd_is_handle_enforce_irqctx(data);
> +}
>   bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear);
>   #else /* CONFIG_GENERIC_PENDING_IRQ */
>   static inline bool irq_can_move_pcntxt(struct irq_data *data)
> @@ -453,6 +457,10 @@ static inline bool irq_fixup_move_pending(struct irq_desc *desc, bool fclear)
>   {
>   	return false;
>   }
> +static inline bool handle_enforce_irqctx(struct irq_data *data)
> +{
> +	return false;
> +}
>   #endif /* !CONFIG_GENERIC_PENDING_IRQ */
>   
>   #if !defined(CONFIG_IRQ_DOMAIN) || !defined(CONFIG_IRQ_DOMAIN_HIERARCHY)
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index 98a5f10d1900..b3e9a66dd079 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -638,9 +638,15 @@ void irq_init_desc(unsigned int irq)
>   int generic_handle_irq(unsigned int irq)
>   {
>   	struct irq_desc *desc = irq_to_desc(irq);
> +	struct irq_data *data;
>   
>   	if (!desc)
>   		return -EINVAL;
> +
> +	data = irq_desc_get_irq_data(desc);
> +	if (WARN_ON_ONCE(!in_irq() && handle_enforce_irqctx(data)))
> +		return -EPERM;
> +
>   	generic_handle_irq_desc(desc);
>   	return 0;
>   }
>
>
>
-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ