Message-ID: <5341707F.5000406@katamail.com>
Date:	Sun, 06 Apr 2014 17:19:27 +0200
From:	Michele Ballabio <barra_cuda@...amail.com>
To:	linux-kernel@...r.kernel.org
CC:	toralf.foerster@....de, fweisbec@...il.com, mingo@...nel.org,
	peterz@...radead.org
Subject: Bisected KVM hang on x86-32 between v3.12 and v3.13

Toralf Förster reported this in
  http://article.gmane.org/gmane.linux.kernel/1662567
  http://article.gmane.org/gmane.linux.kernel/1658422
  http://article.gmane.org/gmane.linux.kernel/1657962

  "The issue happens here at a 32 bit stable Gentoo Linux if
   I try to start a KVM image. Kernels 3.12.X works fine,
   kernel >= v3.13 will hang shortly after I started the image
   with the virtual-manager. The last syslog messages are
   something like:
   Feb 28 16:22:00 n22 kernel: INFO: rcu_sched detected stalls
       on CPUs/tasks: {} (detected by 2, t=60002 jiffies,
       g=14689, c=14688, q=21051)
   Feb 28 16:22:00 n22 kernel: INFO: Stall ended before state
       dump start"

He correctly pointed out that the bisection blamed the merge
commit 37bf06375c90a42fe07b9bebdb07bc316ae5a0ce
"Merge tag 'v3.12-rc4' into sched/core".

This bug evidently needs at least two patches, one on each
side of the merge, and shows up in kvm only once they are
combined (at that merge point). By rebasing the "sched/core"
branch on "master" before the merge and continuing the
bisection from there, I found commit
3e8e42c69bb7d9fc12ebc23ff308e8523a2a59a0
"sched: Revert need_resched() to look at TIF_NEED_RESCHED"
as one of the causes. The other patch that contributes to the
bug is commit ded797547548a5b8e7b92383a41e4c0e6b0ecb7f
"irq: Force hardirq exit's softirq processing on its own stack".
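
To illustrate why a plain bisection lands on the merge commit
and why linearizing the history helps, here is a toy model in
Python. It is only a sketch: the commit names (m1, s1, ...) and
the history layout are made up, "irq" and "sched" merely stand
in for the two patches named above, and the assumption is that
the hang appears only when both are present.

```python
# Toy model, not real kernel history: every name below is hypothetical.

def is_bad(patches):
    """Assumed failure condition: the hang needs both patches at once."""
    return {"irq", "sched"} <= patches

def linear(commits, start=frozenset()):
    """Turn [(name, patch_or_None), ...] into [(name, patches_present), ...]."""
    history, present = [], set(start)
    for name, patch in commits:
        if patch is not None:
            present.add(patch)
        history.append((name, frozenset(present)))
    return history

def first_bad(history):
    """Binary search over a linear history, as a bisection would do.

    Assumes history[0] is good, history[-1] is bad, and that every
    commit after the first bad one is bad as well.
    """
    lo, hi = 0, len(history) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(history[mid][1]):
            hi = mid
        else:
            lo = mid
    return history[hi][0]

master_commits = [("base", None), ("m1", None), ("irq", "irq"), ("m2", None)]
sched_commits = [("s1", None), ("sched", "sched"), ("s2", None)]

master = linear(master_commits)
sched_core = linear([("base", None)] + sched_commits)

# Neither side is bad on its own, so the merge is the first bad commit.
assert not any(is_bad(p) for _, p in master + sched_core)
merged = master[-1][1] | sched_core[-1][1]
assert is_bad(merged)

# Replay the sched/core commits on top of master (the "rebase"),
# then bisect the resulting linear history.
rebased = master + linear(
    [(name + "'", patch) for name, patch in sched_commits],
    start=master[-1][1],
)
culprit = first_bad(rebased)
print(culprit)
```

In this model the search over the rebased history stops at the
replayed "sched" commit, because that is the first point where
both patches are present.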

Reverting either one of them fixes the reported kvm hang,
but a revert is probably not the correct answer.

I wonder if the solution is as simple as this:

--->8---
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0af5250..f3b985d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -126,6 +126,7 @@ config X86
 	select RTC_LIB
 	select HAVE_DEBUG_STACKOVERFLOW
 	select HAVE_IRQ_EXIT_ON_IRQ_STACK if X86_64
+	select HAVE_IRQ_EXIT_ON_IRQ_STACK if X86_32
 	select HAVE_CC_STACKPROTECTOR

 config INSTRUCTION_DECODER
---8<---