linux-kernel - Re: [PATCH] nohz: Make tick_nohz_irq

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAFTL4hx_uXfY=bH9nnEsN6AJFTcwqdAWTxcR0jc=hL1FBrwk0w@mail.gmail.com>
Date:	Wed, 20 Feb 2013 23:01:17 +0100
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Ingo Molnar <mingo@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>, stable@...r.kernel.org
Subject: Re: [PATCH] nohz: Make tick_nohz_irq_exit() irq safe

2013/2/20 Thomas Gleixner <tglx@...utronix.de>:
> On Wed, 20 Feb 2013, Frederic Weisbecker wrote:
>
>> As it stands, irq_exit() may or may not be called with
>> irqs disabled, depending on __ARCH_IRQ_EXIT_IRQS_DISABLED
>> that the arch can define.
>>
>> It makes tick_nohz_irq_exit() unsafe. For example two
>> interrupts can race in tick_nohz_stop_sched_tick(): the inner
>> most one computes the expiring time on top of the timer list,
>> then it's interrupted right before reprogramming the
>> clock. The new interrupt enqueues a new timer list timer,
>> it reprogram the clock to take it into account and it exits.
>> The CPUs resumes the inner most interrupt and performs the clock
>> reprogramming without considering the new timer list timer.
>>
>> This regression has been introduced by:
>>      280f06774afedf849f0b34248ed6aff57d0f6908
>>      ("nohz: Separate out irq exit and idle loop dyntick logic")
>>
>> Let's fix it right now with the appropriate protections.
>
> That's not a fix. That's an hack.

I know it looks that way. That's because it's a pure regression fix,
minimal for backportability.

I'm distinguishing two different things here: the fact that some archs
can call irq_exit() with interrupts enabled which is a global design
problem, and the fact that tick_nohz_irq_exit() was safe against that
until 3.2 when I broke it with a commit of mine.

My goal was basically to restore that protection in a minimal commit
such that we can backport the regression fix, then deal with
__ARCH_IRQ_EXIT_IRQS_DISABLED afterward, since it requires some more
invasive changes.

>> A saner long term solution will be to remove
>> __ARCH_IRQ_EXIT_IRQS_DISABLED.
>
> We really want to enforce that interrupt disabled condition for
> calling irq_exit(). So why make this exclusive to tick_nohz_irq_exit()?

I need a fix that I can backport. Is the below fine with a stable tag?
It looks a bit too invasive for the single regression involved.

Thanks.

>
> Thanks,
>
>         tglx
>
> Index: linux-2.6/kernel/softirq.c
> ===================================================================
> --- linux-2.6.orig/kernel/softirq.c
> +++ linux-2.6/kernel/softirq.c
> @@ -322,18 +322,10 @@ void irq_enter(void)
>
>  static inline void invoke_softirq(void)
>  {
> -       if (!force_irqthreads) {
> -#ifdef __ARCH_IRQ_EXIT_IRQS_DISABLED
> +       if (!force_irqthreads)
>                 __do_softirq();
> -#else
> -               do_softirq();
> -#endif
> -       } else {
> -               __local_bh_disable((unsigned long)__builtin_return_address(0),
> -                               SOFTIRQ_OFFSET);
> +       else
>                 wakeup_softirqd();
> -               __local_bh_enable(SOFTIRQ_OFFSET);
> -       }
>  }
>
>  /*
> @@ -341,6 +333,14 @@ static inline void invoke_softirq(void)
>   */
>  void irq_exit(void)
>  {
> +#ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
> +       unsigned long flags;
> +
> +       local_irq_save(flags);
> +#else
> +       BUG_ON(!irqs_disabled();
> +#endif
> +
>         account_irq_exit_time(current);
>         trace_hardirq_exit();
>         sub_preempt_count(IRQ_EXIT_OFFSET);
> @@ -354,6 +354,9 @@ void irq_exit(void)
>  #endif
>         rcu_irq_exit();
>         sched_preempt_enable_no_resched();
> +#ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
> +       local_irq_restore(flags);
> +#endif
>  }
>
>  /*
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/