Message-ID: <alpine.LFD.2.02.1302261320120.22263@ionos>
Date:	Tue, 26 Feb 2013 13:36:36 +0100 (CET)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
cc:	linux-kernel@...r.kernel.org, xen-devel@...ts.xensource.com
Subject: Re: Regression introduced with 14e568e78f6f80ca1e27256641ddf524c7dbdc51
 (stop_machine: Use smpboot threads)

On Fri, 22 Feb 2013, Konrad Rzeszutek Wilk wrote:
> 
> I don't know if this is b/c the Xen code is missing something or
> expects something that never happened. I hadn't looked at your
> patch in any detail (was going to do that on Monday).
> 
> Either way, if I boot a HVM guest with PV extensions (aka PVHVM)
> this is what I get:
> [    0.133081] cpu 1 spinlock event irq 71
> [    0.134049] smpboot: Booting Node   0, Processors  #1[    0.008000] installing Xen timer for CPU 1
> [    0.205154] Brought up 2 CPUs
> [    0.205156] smpboot: Total of 2 processors activated (16021.74 BogoMIPS)
> 
> [   28.134000] BUG: soft lockup - CPU#0 stuck for 23s! [migration/0:8]
> [   28.134000] Modules linked in:
> [   28.134000] CPU 0 
> [   28.134000] Pid: 8, comm: migration/0 Tainted: G        W    3.8.0upstream-06472-g6661875-dirty #1 Xen HVM domU
> [   28.134000] RIP: 0010:[<ffffffff8110711b>]  [<ffffffff8110711b>] stop_machine_cpu_stop+0x7b/0xf0

So the migration thread loops in stop_machine_cpu_stop(). Now the
interesting question is what work was scheduled for that cpu.

The main difference between the old code and the new one is that the
thread is created earlier and not destroyed on cpu offline.

Could you add some instrumentation, so we can see what kind of cpu
stop work is scheduled and from where?

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
