linux-kernel - Re: [PATCH 32/35] clockevents: Fix cpu down race for hrtimer based broadcasting

Message-id: <alpine.LFD.2.11.1502191124510.22104@knanqh.ubzr>
Date:	Thu, 19 Feb 2015 12:51:52 -0500 (EST)
From:	Nicolas Pitre <nico@...xnic.net>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org, mingo@...nel.org, rjw@...ysocki.net,
	tglx@...utronix.de, Preeti U Murthy <preeti@...ux.vnet.ibm.com>
Subject: Re: [PATCH 32/35] clockevents: Fix cpu down race for hrtimer based
 broadcasting

On Mon, 16 Feb 2015, Peter Zijlstra wrote:

> From: Thomas Gleixner <tglx@...utronix.de>
> 
> Preeti reported a cpu down race with hrtimer based broadcasting:
> 
> Assume CPU1 is the CPU which holds the hrtimer broadcasting duty
> before it is taken down.
> 
> CPU0                          CPU1
> cpu_down()
>                               takedown_cpu()
>                                 disable_interrupts()
> cpu_die()
>   while (CPU1 != DEAD) {
>     msleep(100);
>       switch_to_idle()
>         stop_cpu_timer()
>           schedule_broadcast()
>   }
> 
> tick_cleanup_dead_cpu()
>      take_over_broadcast()
> 
> So after CPU1 disabled interrupts it cannot handle the broadcast
> hrtimer anymore, so CPU0 will be stuck forever.
> 
> Doing a "while (CPU1 != DEAD) msleep(100);" periodic poll is silly at
> best, but we need to fix that nevertheless.
> 
> Split the tick cleanup into two pieces:
> 
> 1) Shutdown and remove all per cpu clockevent devices from
>    takedown_cpu()
> 
>    This is done carefully with respect to existing arch code which
>    works around the shortcoming of the clockevents core code in
>    interesting ways. We really want a separate callback for this to
>    cleanup the workarounds, but that's not scope of this patch
> 
> 2) Take over the broadcast duty explicitly before calling cpu_die()
> 
>    This is a temporary workaround as well. What we really want is a
>    callback in the clockevent device which allows us to do that from
>    the dying CPU by pushing the hrtimer onto a different cpu. That
>    might involve an IPI and is definitely more complex than this
>    immediate fix.
> 
> Reported-by: Preeti U Murthy <preeti@...ux.vnet.ibm.com>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
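
For readers following the quoted changelog, the race and the effect of
step 2 can be replayed in a small userspace model. This is only a sketch
under stated assumptions: it is not kernel code, every toy_* name is
hypothetical, and a single flag stands in for "this CPU can still service
its timer interrupt".

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 2

/* Toy model only; none of these names exist in the kernel. */
struct toy_cpu {
    bool irqs_enabled;  /* can this CPU still run timer handlers? */
    bool waiting;       /* idle, waiting for a broadcast wakeup */
};

struct toy_system {
    struct toy_cpu cpu[TOY_NR_CPUS];
    int broadcast_owner;  /* CPU that runs the broadcast hrtimer */
};

/* Initial state from the changelog: CPU1 holds the broadcast duty. */
void toy_init(struct toy_system *s)
{
    for (int i = 0; i < TOY_NR_CPUS; i++) {
        s->cpu[i].irqs_enabled = true;
        s->cpu[i].waiting = false;
    }
    s->broadcast_owner = 1;
}

/* One broadcast expiry: waiters wake only if the owner can take the interrupt. */
void toy_broadcast_expire(struct toy_system *s)
{
    assert(s->broadcast_owner >= 0 && s->broadcast_owner < TOY_NR_CPUS);
    if (!s->cpu[s->broadcast_owner].irqs_enabled)
        return;  /* the broadcast hrtimer never runs, nobody is woken */
    for (int i = 0; i < TOY_NR_CPUS; i++)
        s->cpu[i].waiting = false;
}

/*
 * Replays the cpu_down() sequence. Returns true if CPU0 wakes up from
 * its sleep, false if it would be stuck forever.
 */
bool toy_cpu_down(bool take_over_first)
{
    struct toy_system s;

    toy_init(&s);
    s.cpu[1].irqs_enabled = false;  /* takedown_cpu(): disable_interrupts() */
    if (take_over_first)
        s.broadcast_owner = 0;      /* step 2: take over before cpu_die() */

    s.cpu[0].waiting = true;        /* msleep() -> idle -> schedule_broadcast() */
    toy_broadcast_expire(&s);
    return !s.cpu[0].waiting;
}
```

In this model toy_cpu_down(false) leaves CPU0 waiting, while
toy_cpu_down(true) wakes it, because the broadcast duty has moved to a
CPU that can still take interrupts.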

This breaks the b.L switcher disabling code which essentially does:

static void bL_switcher_restore_cpus(void)
{
        int i;

        for_each_cpu(i, &bL_switcher_removed_logical_cpus) {
                struct device *cpu_dev = get_cpu_device(i);
                int ret = device_online(cpu_dev);
                if (ret)
                        dev_err(cpu_dev, "switcher: unable to restore CPU\n");
        }
}

However, as soon as one CPU comes back online, the following crash
occurs on that CPU:

[  547.858031] ------------[ cut here ]------------
[  547.871868] kernel BUG at kernel/time/hrtimer.c:1249!
[  547.886991] Internal error: Oops - BUG: 0 [#1] SMP THUMB2
[  547.903155] Modules linked in:
[  547.912303] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.19.0-rc5-00058-gdd7a65fbc5 #527
[...]
[  548.599060] [<c005a1c2>] (hrtimer_interrupt) from [<c00639db>] (tick_do_broadcast.constprop.8+0x8f/0x90)
[  548.627482] [<c00639db>] (tick_do_broadcast.constprop.8) from [<c0063acd>] (tick_handle_oneshot_broadcast+0xf1/0x168)
[  548.659290] [<c0063acd>] (tick_handle_oneshot_broadcast) from [<c001a07f>] (sp804_timer_interrupt+0x2b/0x30)
[  548.688755] [<c001a07f>] (sp804_timer_interrupt) from [<c004ee9b>] (handle_irq_event_percpu+0x37/0x130)
[  548.716916] [<c004ee9b>] (handle_irq_event_percpu) from [<c004efc7>] (handle_irq_event+0x33/0x48)
[  548.743511] [<c004efc7>] (handle_irq_event) from [<c0050c1d>] (handle_fasteoi_irq+0x69/0xe4)
[  548.768804] [<c0050c1d>] (handle_fasteoi_irq) from [<c004e835>] (generic_handle_irq+0x1d/0x28)
[  548.794619] [<c004e835>] (generic_handle_irq) from [<c004ea17>] (__handle_domain_irq+0x3f/0x80)
[  548.820694] [<c004ea17>] (__handle_domain_irq) from [<c00084f5>] (gic_handle_irq+0x21/0x4c)
[  548.845729] [<c00084f5>] (gic_handle_irq) from [<c04521db>] (__irq_svc+0x3b/0x5c)

The corresponding code is:

void hrtimer_interrupt(struct clock_event_device *dev)
{
        struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
        ktime_t expires_next, now, entry_time, delta;
        int i, retries = 0;

        BUG_ON(!cpu_base->hres_active);
[...]
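
The check that fires can be shown in isolation with a userspace sketch.
It assumes nothing beyond what the excerpt above shows: a per-CPU base
with an hres_active flag that must be set before the interrupt handler
runs. The toy_* names are hypothetical, and the model does not try to
explain why the broadcast path reaches hrtimer_interrupt() on CPU 2
while that flag is still clear.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Hypothetical stand-in for the one hrtimer_cpu_base field checked above. */
struct toy_cpu_base {
    bool hres_active;
};

/* Zero-initialized: no CPU is in high resolution mode at start. */
static struct toy_cpu_base toy_bases[TOY_NR_CPUS];

/* Models a CPU switching to high resolution mode. */
void toy_switch_to_hres(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    toy_bases[cpu].hres_active = true;
}

/*
 * Models the entry check of hrtimer_interrupt(). Returns false where
 * the kernel would hit BUG_ON(!cpu_base->hres_active), true otherwise.
 */
bool toy_hrtimer_interrupt(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    if (!toy_bases[cpu].hres_active)
        return false;
    /* Expired timers would be run here in the real handler. */
    return true;
}
```

Calling toy_hrtimer_interrupt(2) before toy_switch_to_hres(2) returns
false, which corresponds to the BUG above; after the switch it returns
true.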

Reverting this patch "fixes" the problem.


Nicolas