Date: Sun, 30 Sep 2012 19:23:08 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Avi Kivity <avi@...hat.com>
Cc: paulmck@...ux.vnet.ibm.com, Josh Boyer <jwboyer@...hat.com>,
    Christian Hoffmann <email@...istianhoffmann.info>,
    LKML <linux-kernel@...r.kernel.org>,
    "kvm@...r.kernel.org" <kvm@...r.kernel.org>, johnstul@...ibm.com,
    tglx@...utronix.de
Subject: Re: INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=10002 jiffies)

On Sun, Sep 30, 2012 at 01:10:55PM +0200, Avi Kivity wrote:
> On 09/28/2012 05:35 AM, Paul E. McKenney wrote:
> > On Thu, Sep 27, 2012 at 12:40:44PM +0800, Fengguang Wu wrote:
> >> On Wed, Sep 26, 2012 at 09:28:50PM -0700, Paul E. McKenney wrote:
> >> > On Thu, Sep 27, 2012 at 10:54:00AM +0800, Fengguang Wu wrote:
> >> > > On Wed, Sep 26, 2012 at 09:45:43AM -0700, Paul E. McKenney wrote:
> >> > > > On Wed, Sep 26, 2012 at 04:15:01PM +0800, Fengguang Wu wrote:
> >
> > [ . . . ]
> >
> >> > > > But could you also please send your .config file and a description of
> >> > >
> >> > > .config attached.
> >> > >
> >> > > > the workload you are running?
> >> > >
> >> > > It's basically the below commands. The exact initrd is not relevant in
> >> > > this case because it's a boot time warning before user space is
> >> > > started. The stalls roughly happen 1 time on every 10 boots.
> >> >
> >> > Yow!!!
> >> >
> >> > You have severe cross-CPU time-synchronization problems. See for
> >> > example the first dmesg, with the relevant part extracted right here.
> >> > One CPU believes that it is about 37 seconds past boot, and the other
> >> > CPU believes that it is about 137 seconds past boot. Given that large
> >> > of a time difference, an RCU CPU stall warning is expected behavior.
> >>
> >> Good spot! Yeah I noticed that huge timestamp gap, however didn't take
> >> it seriously enough..
> >>
> >> > Get your two CPUs in agreement about what time it is, and I bet that
> >> > the CPU stall warnings will go away.
> >>
> >> Possibly KVM related? Because the warnings show up in many test boxes
> >> running KVM and so is not likely some hardware specific issue.
> >
> > I vaguely recall seeing something recently. But let's ask the KVM and
> > timekeeping guys.
>
> From the logs it looks like hpet (why not kvmclock?) is used for the
> clock, it should not generate such drifts since it is a global clock.
> Can you verify current_clocksource on a boot that actually failed (in
> case the clocksource is switched during runtime)?

I've checked out the dmesg that's cited by Paul, attached. Yes it
contains lines

[    4.970051] Switching to clocksource hpet

and then

[    7.250353] Switching to clocksource tsc

And there are no kvm-clock lines. Oh well for this particular kernel:

        # CONFIG_KVM_CLOCK is not set

I'm not sure how this happened, maybe some kconfig that CONFIG_KVM_CLOCK
depends on is randconfig'ed to off..

Thanks,
Fengguang

View attachment "dmesg-kvm_bisect2-inn-42527-2012-09-27-10-38-38-3.6.0-rc7-bisect2-00078-g593d100-21" of type "text/plain" (307119 bytes)
View attachment ".config" of type "text/plain" (69161 bytes)
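The two checks described in the message (which clocksources the kernel switched to during boot, and whether CONFIG_KVM_CLOCK is set) could be scripted roughly as follows. This is only a sketch: the function names and the inline sample strings are hypothetical, and the samples reproduce just the lines quoted above, not the attached files.

```python
import re

# Matches boot-log lines such as "[    4.970051] Switching to clocksource hpet".
SWITCH_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+Switching to clocksource (\S+)")


def clocksource_switches(dmesg_text):
    """Return (timestamp, clocksource) pairs in the order they appear."""
    return [(float(ts), name) for ts, name in SWITCH_RE.findall(dmesg_text)]


def kvm_clock_enabled(config_text):
    """Return True only if the .config text contains CONFIG_KVM_CLOCK=y."""
    return any(
        line.strip() == "CONFIG_KVM_CLOCK=y"
        for line in config_text.splitlines()
    )


# Hypothetical inputs: only the lines quoted in the message above.
sample_dmesg = """\
[    4.970051] Switching to clocksource hpet
[    7.250353] Switching to clocksource tsc
"""
sample_config = "# CONFIG_KVM_CLOCK is not set\n"

print(clocksource_switches(sample_dmesg))
print(kvm_clock_enabled(sample_config))
```

On a running guest, the clocksource currently in use can also be read from /sys/devices/system/clocksource/clocksource0/current_clocksource, which covers the case where the clocksource is switched during runtime.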