linux-kernel - Re: BUG during shutdown - bisected to commit e2912009

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1262392920.32223.10.camel@laptop>
Date:	Sat, 02 Jan 2010 01:42:00 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Marc Dionne <marc.c.dionne@...il.com>
Cc:	linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
	Xiaotian Feng <dfeng@...hat.com>, Ingo Molnar <mingo@...e.hu>
Subject: Re: BUG during shutdown - bisected to commit e2912009

On Fri, 2010-01-01 at 19:27 -0500, Marc Dionne wrote:
> I'm getting a BUG with current kernels from
> kernel/time/clockevents.c:263 when halting the system - a restart
> behaves normally.  I don't have a good camera handy at the moment to
> capture the call stack on screen, but the call sequence is:
> 
> clockevents_notify
> hrtimer_cpu_notify
> notifier_call_chain
> raw_notifier_call_chain
> _cpu_down
> disable_nonboot_cpus
> kernel_power_off
> sys_reboot
> 
> I bisected it down to commit e2912009: sched: Ensure set_task_cpu() is
> never called on blocked tasks.  There were a few commits tested along
> the way where I got a freeze (with the power still on) instead of a
> BUG. Reverting that commit from the current kernel doesn't look
> trivial, but the commit immediately preceding this one does halt fine.

We somehow seem to trip up the below patch, which doesn't really make
sense, as I can't find how task placement would affect the below error.

It seems to purely test against the hot-unplugged cpu, not a cpu the
task is running on.

---
commit bb6eddf7676e1c1f3e637aa93c5224488d99036f
Author: Thomas Gleixner <tglx@...utronix.de>
Date:   Thu Dec 10 15:35:10 2009 +0100

    clockevents: Prevent clockevent_devices list corruption on cpu hotplug
    
    Xiaotian Feng triggered a list corruption in the clock events list on
    CPU hotplug and debugged the root cause.
    
    If a CPU registers more than one per cpu clock event device, then only
    the active clock event device is removed on CPU_DEAD. The unused
    devices are kept in the clock events device list.
    
    On CPU up the clock event devices are registered again, which means
    that we list_add an already enqueued list_head. That results in list
    corruption.
    
    Resolve this by removing all devices which are associated to the dead
    CPU on CPU_DEAD.
    
    Reported-by: Xiaotian Feng <dfeng@...hat.com>
    Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
    Tested-by: Xiaotian Feng <dfeng@...hat.com>
    Cc: stable@...nel.org

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 20a8920..91db2e3 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -238,8 +238,9 @@ void clockevents_exchange_device(struct clock_event_device *old,
  */
 void clockevents_notify(unsigned long reason, void *arg)
 {
-	struct list_head *node, *tmp;
+	struct clock_event_device *dev, *tmp;
 	unsigned long flags;
+	int cpu;
 
 	spin_lock_irqsave(&clockevents_lock, flags);
 	clockevents_do_notify(reason, arg);
@@ -250,8 +251,19 @@ void clockevents_notify(unsigned long reason, void *arg)
 		 * Unregister the clock event devices which were
 		 * released from the users in the notify chain.
 		 */
-		list_for_each_safe(node, tmp, &clockevents_released)
-			list_del(node);
+		list_for_each_entry_safe(dev, tmp, &clockevents_released, list)
+			list_del(&dev->list);
+		/*
+		 * Now check whether the CPU has left unused per cpu devices
+		 */
+		cpu = *((int *)arg);
+		list_for_each_entry_safe(dev, tmp, &clockevent_devices, list) {
+			if (cpumask_test_cpu(cpu, dev->cpumask) &&
+			    cpumask_weight(dev->cpumask) == 1) {
+				BUG_ON(dev->mode != CLOCK_EVT_MODE_UNUSED);
+				list_del(&dev->list);
+			}
+		}
 		break;
 	default:
 		break;


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/