lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Thu, 26 Feb 2015 14:05:20 +0100
From:	Gerlando Falauto <gerlando.falauto@...mile.com>
To:	Thomas Gleixner <tglx@...utronix.de>,
	Laurent Pinchart <laurent.pinchart@...asonboard.com>
CC:	linux-kernel@...r.kernel.org, rtc-linux@...glegroups.com,
	"Rafael J. Wysocki" <rjw@...ysocki.net>
Subject: Re: [PATCH/RFC] rtc: rtc-twl: Fixed nested IRQ handling in resume
 from suspend

Hi Thomas,

did anything ever come out of this?

I'm encountering a similar but different problem, where a nested 
interrupt handler is called directly from the resend tasklet (and 
therefore -- if I'm not mistaken -- in an atomic context, which is 
unexpected). This issue however appears during system startup, when the 
interrupt is first enabled, and is very, very hard to reproduce. So I 
don't have much room for further investigation.

Do you have any clue what might be the sequence of events leading to 
this behavior?

While I reckon that using irq_set_parent() on the nested IRQ might as 
well fix the issue (or just as well hide it), I just don't understand 
how this all works. Plus, I wonder why it is (almost) never used... I 
only found two occurrences of irq_set_parent() on 3.19-rc5 (gpiolib.c 
and twl6030-irq.c), whereas exactly zero on 3.10.60.
Is there any known way to trigger/test interrupt resending (which, I 
believe, has been around for a very long time)?

Thank you!
Gerlando

On 09/17/2014 11:57 PM, Thomas Gleixner wrote:
> On Sat, 13 Sep 2014, Laurent Pinchart wrote:
>
>> The TWL RTC interrupt is a double-nested threaded interrupt, handled
>> through the TWL SIH (Secondary Interrupt Handler) and PIH (Primary
>> Interrupt Handler).
>>
>> When the system is woken up from suspend by a TWL RTC alarm interrupt,
>> the TWL PIH and SIH are enabled first (due to the normal IRQ enabling
>> sequence for the PIH and to the IRQF_EARLY_RESUME flag for the SIH)
>> before the TWL RTC interrupt gets enabled. This results on the interrupt
>> being processed by the TWL primary interrupt handler, forwarded to the
>> nested SIH, and then marked as pending for the RTC by handle_nested_irq
>> called from the SIH.
>>
>> The RTC interrupt then eventually gets reenabled the kernel, which will
>> try to call its top half interrupt handler. As the interrupt is a nested
>> threaded IRQ, its primary handler has been set to the
>> irq_nested_primary_handler function, which is never supposed to be
>> called and generates a WARN_ON, without waking the IRQ thread up.
>
> Right. It CANNOT wake up the thread, because there is no thread
> associated to that particular interrupt. It's handler is called in the
> context of the parent (demultiplexing) interrupt thread. Of course
> twl4030 does not call irq_set_parent() for the nested
> interrupts. That's there so the resend of a nested thread irq will be
> targeted to its parent.
>
> Using IRQF_EARLY_RESUME is really, really wrong for device drivers
> simply because at the point where early resume is called the devices
> have not yet been resumed, so a interrupt delivered at this point
> might run into a stale device and cause a machine stall or any other
> undesired side effect. It was added for a special case with Xen where
> Xen needs the IPIs working early in resume. And it's definitely not
> meant to solve ordering issues of interrupts on resume.
>
> That said, looking at that twl4030 driver, there seems to be a double
> nesting issue. So also the simple one level parent redirection of the
> irq resend wont work. I really wonder why this only triggers in the
> context of resume.
>
> Now the resend issue is simple to fix. The resume time ordering
> constraints is a bit more involved, but it should be possible w/o
> inflicting anything more complex on drivers than requiring them to use
> irq_set_parent(), which should be name irq_set_nested_parent().
>
> Completely untested patch below. It applies on top of
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/pm
>
> So what twl4030 needs on top of that are calls to
> irq_set_nested_parent() for the nested interrupts.
>
> That will automatically establish the nesting depth, redirect the
> retrigger to the top most interrupt and solve the resume ordering.
>
> The resume ordering is the reverse of the nesting:
>
>      top-irq1 - nested irq10   - nested irq20 (parent = 10)
>      	     | (parent = 1)   - nested irq21 (parent = 10)
> 	     |
>      	     - nested irq11   - nested irq22 (parent = 11)	
> 	     | (parent = 1)   - nested irq23 (parent = 11)
> 	     |
> 	     - nested irq12   - nested irq24 (parent = 12)
> 	       (parent = 1)   - nested irq25 (parent = 12)
>
> So the resume ordering is
>
>     20-21-22-23-24-25 - 10-11-12 - 1
>
> Thanks,
>
> 	tglx
>
> ---
>   drivers/mfd/twl6030-irq.c |    3 --
>   include/linux/irq.h       |    9 ------
>   include/linux/irqdesc.h   |    5 +++
>   kernel/irq/manage.c       |   28 ++++++++++++++++++---
>   kernel/irq/pm.c           |   60 +++++++++++++++++++++++++++++++++++++++++-----
>   kernel/irq/resend.c       |    4 +--
>   kernel/irq/settings.h     |    5 +++
>   7 files changed, 92 insertions(+), 22 deletions(-)
>
> Index: tip/drivers/mfd/twl6030-irq.c
> ===================================================================
> --- tip.orig/drivers/mfd/twl6030-irq.c
> +++ tip/drivers/mfd/twl6030-irq.c
> @@ -350,8 +350,7 @@ static int twl6030_irq_map(struct irq_do
>
>   	irq_set_chip_data(virq, pdata);
>   	irq_set_chip_and_handler(virq,  &pdata->irq_chip, handle_simple_irq);
> -	irq_set_nested_thread(virq, true);
> -	irq_set_parent(virq, pdata->twl_irq);
> +	irq_set_nested_parent(virq, pdata->twl_irq);
>
>   #ifdef CONFIG_ARM
>   	/*
> Index: tip/include/linux/irq.h
> ===================================================================
> --- tip.orig/include/linux/irq.h
> +++ tip/include/linux/irq.h
> @@ -415,14 +415,7 @@ static inline void irq_move_masked_irq(s
>
>   extern int no_irq_affinity;
>
> -#ifdef CONFIG_HARDIRQS_SW_RESEND
> -int irq_set_parent(int irq, int parent_irq);
> -#else
> -static inline int irq_set_parent(int irq, int parent_irq)
> -{
> -	return 0;
> -}
> -#endif
> +int irq_set_nested_parent(int irq, int parent_irq);
>
>   /*
>    * Built-in IRQ handlers for various IRQ types,
> Index: tip/include/linux/irqdesc.h
> ===================================================================
> --- tip.orig/include/linux/irqdesc.h
> +++ tip/include/linux/irqdesc.h
> @@ -41,6 +41,9 @@ struct irq_desc;
>    *			IRQF_NO_SUSPEND set
>    * @force_resume_depth:	number of irqactions on a irq descriptor with
>    *			IRQF_FORCE_RESUME set
> + * @parent_irq:		Parent interrupt in case of nested threads
> + * @parent_top:		Top parent interrupt in case of multiple nested threads
> + * @parent_depth:	Nest level of multiple nested threads for resume ordering
>    * @dir:		/proc/irq/ procfs entry
>    * @name:		flow handler name for /proc/interrupts output
>    */
> @@ -82,6 +85,8 @@ struct irq_desc {
>   	struct proc_dir_entry	*dir;
>   #endif
>   	int			parent_irq;
> +	int			parent_depth;
> +	int			parent_top;
>   	struct module		*owner;
>   	const char		*name;
>   } ____cacheline_internodealigned_in_smp;
> Index: tip/kernel/irq/manage.c
> ===================================================================
> --- tip.orig/kernel/irq/manage.c
> +++ tip/kernel/irq/manage.c
> @@ -624,21 +624,41 @@ int __irq_set_trigger(struct irq_desc *d
>   	return ret;
>   }
>
> -#ifdef CONFIG_HARDIRQS_SW_RESEND
> -int irq_set_parent(int irq, int parent_irq)
> +int irq_set_nested_parent(int irq, int parent_irq)
>   {
> +	struct irq_desc *desc;
>   	unsigned long flags;
> -	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);
> +	int parent_depth, parent_top;
>
> +	/*
> +	 * Get the parent irq first to retrieve the parent depth and
> +	 * the top level parent irq number. The depth is required for
> +	 * resume ordering, the top level irq number for software
> +	 * resend.
> +	 */
> +	desc = irq_get_desc_lock(parent_irq, &flags, 0);
> +	if (!desc)
> +		return -EINVAL;
> +	parent_depth = desc->parent_depth;
> +	parent_top = desc->parent_top;
> +	irq_put_desc_unlock(desc, flags);
> +
> +	/* Setup the info in the child */
> +	desc = irq_get_desc_lock(parent_irq, &flags, 0);
>   	if (!desc)
>   		return -EINVAL;
>
>   	desc->parent_irq = parent_irq;
> +	desc->parent_top = parent_top ? parent_top : parent_irq;
> +	desc->parent_depth = parent_depth + 1;
> +
> +	/* Set the nested thread bit as well */
> +	irq_settings_set_nested_thread(desc);
>
>   	irq_put_desc_unlock(desc, flags);
> +
>   	return 0;
>   }
> -#endif
>
>   /*
>    * Default primary interrupt handler for threaded interrupts. Is
> Index: tip/kernel/irq/pm.c
> ===================================================================
> --- tip.orig/kernel/irq/pm.c
> +++ tip/kernel/irq/pm.c
> @@ -8,6 +8,7 @@
>
>   #include <linux/irq.h>
>   #include <linux/module.h>
> +#include <linux/bitmap.h>
>   #include <linux/interrupt.h>
>   #include <linux/suspend.h>
>   #include <linux/syscore_ops.h>
> @@ -93,6 +94,37 @@ static bool suspend_device_irq(struct ir
>   	return true;
>   }
>
> +#define IRQ_NEST_DEPTH 4
> +
> +struct resume_level {
> +	unsigned long	map[BITS_TO_LONGS(IRQ_BITMAP_BITS)];
> +	int		top;
> +};
> +
> +static struct resume_level resume_order_lvls[IRQ_NEST_DEPTH];
> +static struct resume_level resume_early_lvl;
> +static int resume_order_max_depth;
> +
> +static void update_resume_order(struct irq_desc *desc, int irq)
> +{
> +	bool early = desc->action && desc->action->flags & IRQF_EARLY_RESUME;
> +	int depth = desc->parent_depth;
> +	struct resume_level *l;
> +
> +	if (!early) {
> +		if (WARN_ON_ONCE(depth >= IRQ_NEST_DEPTH))
> +			depth = IRQ_NEST_DEPTH - 1;
> +		l = resume_order_lvls + depth;
> +		if (depth > resume_order_max_depth)
> +			resume_order_max_depth = depth;
> +	} else {
> +		WARN_ON_ONCE(depth != 0);
> +		l = &resume_early_lvl;
> +	}
> +	set_bit(irq, l->map);
> +	l->top = irq;
> +}
> +
>   /**
>    * suspend_device_irqs - disable all currently enabled interrupt lines
>    *
> @@ -114,11 +146,16 @@ void suspend_device_irqs(void)
>   	struct irq_desc *desc;
>   	int irq;
>
> +	memset(resume_order_lvls, 0, sizeof(resume_order_lvls));
> +	memset(resume_early_lvl, 0, sizeof(resume_early_lvl));
> +	resume_order_max_depth = 0;
> +
>   	for_each_irq_desc(irq, desc) {
>   		unsigned long flags;
>   		bool sync;
>
>   		raw_spin_lock_irqsave(&desc->lock, flags);
> +		update_resume_order(desc, irq);
>   		sync = suspend_device_irq(desc, irq);
>   		raw_spin_unlock_irqrestore(&desc->lock, flags);
>
> @@ -146,17 +183,16 @@ resume:
>   	__enable_irq(desc, irq);
>   }
>
> -static void resume_irqs(bool want_early)
> +static void resume_irq_lvl(struct resume_level *l)
>   {
>   	struct irq_desc *desc;
> +	unsigned long flags;
>   	int irq;
>
> -	for_each_irq_desc(irq, desc) {
> -		unsigned long flags;
> -		bool is_early = desc->action &&
> -			desc->action->flags & IRQF_EARLY_RESUME;
> +	for_each_set_bit(irq, l->map, l->top + 1) {
>
> -		if (!is_early && want_early)
> +		desc = irq_to_desc(irq);
> +		if (WARN_ON_ONCE(!desc))
>   			continue;
>
>   		raw_spin_lock_irqsave(&desc->lock, flags);
> @@ -165,6 +201,18 @@ static void resume_irqs(bool want_early)
>   	}
>   }
>
> +static void resume_irqs(bool early)
> +{
> +	int i;
> +
> +	if (early) {
> +		resume_irq_lvl(&resume_early_lvl);
> +	} else {
> +		for (i = resume_order_max_depth; i >= 0; i--)
> +			resume_irq_lvl(resume_order_lvls + i);
> +	}
> +}
> +
>   /**
>    * irq_pm_syscore_ops - enable interrupt lines early
>    *
> Index: tip/kernel/irq/resend.c
> ===================================================================
> --- tip.orig/kernel/irq/resend.c
> +++ tip/kernel/irq/resend.c
> @@ -79,9 +79,9 @@ void check_irq_resend(struct irq_desc *d
>   			 * in the thread context of the parent irq,
>   			 * retrigger the parent.
>   			 */
> -			if (desc->parent_irq &&
> +			if (desc->parent_top &&
>   			    irq_settings_is_nested_thread(desc))
> -				irq = desc->parent_irq;
> +				irq = desc->parent_top;
>   			/* Set it pending and activate the softirq: */
>   			set_bit(irq, irqs_resend);
>   			tasklet_schedule(&resend_tasklet);
> Index: tip/kernel/irq/settings.h
> ===================================================================
> --- tip.orig/kernel/irq/settings.h
> +++ tip/kernel/irq/settings.h
> @@ -53,6 +53,11 @@ static inline void irq_settings_set_per_
>   	desc->status_use_accessors |= _IRQ_PER_CPU;
>   }
>
> +static inline void irq_settings_set_nested_thread(struct irq_desc *desc)
> +{
> +	desc->status_use_accessors |= _IRQ_NESTED_THREAD;
> +}
> +
>   static inline void irq_settings_set_no_balancing(struct irq_desc *desc)
>   {
>   	desc->status_use_accessors |= _IRQ_NO_BALANCING;
>
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ