Message-ID: <20120925194250.GA25922@kroah.com>
Date:	Tue, 25 Sep 2012 12:42:50 -0700
From:	Greg KH <gregkh@...uxfoundation.org>
To:	Paweł Sikora <pluto@...-linux.org>
Cc:	linux-kernel@...r.kernel.org, arekm@...-linux.org,
	baggins@...-linux.org
Subject: Re: [3.5.4] rcu_sched self-detected stall on CPU { 1}  (t=54862991
 jiffies)

On Tue, Sep 25, 2012 at 07:04:19PM +0200, Paweł Sikora wrote:
> On Tuesday 25 of September 2012 09:44:54 Greg KH wrote:
> > On Tue, Sep 25, 2012 at 06:31:36PM +0200, Paweł Sikora wrote:
> > > On Monday 24 of September 2012 10:36:33 Greg KH wrote:
> > > > On Mon, Sep 24, 2012 at 10:05:23AM +0200, Paweł Sikora wrote:
> > > > > Hi,
> > > > > 
> > > > > with the new stable line i'm observing strange locks on my old amd-phenom-II mini-server.
> > > > > here's a dmesg:
> > > > 
> > > > Did this show up in 3.5.3?  If not, can you run 'git bisect' to find the
> > > > problem patch?
> > > 
> > > heh, the good old kernel shed some light on this issue.
> > > 
> > > Sep 25 08:50:24 nexus kernel: [60330.301639] Clocksource tsc unstable (delta = -474690884 ns)
> > > Sep 25 08:50:24 nexus kernel: [60330.325477] ------------[ cut here ]------------
> > > Sep 25 08:50:24 nexus kernel: [60330.325484] WARNING: at /home/users/builder/rpm/BUILD/kernel-2.6.37.6/linux-2.6.37/net/sched/sch_generic.c:258 dev_watchdog+0x25d/0x270()
> > > Sep 25 08:50:24 nexus kernel: [60330.325486] Hardware name: GA-MA785GMT-UD2H
> > > Sep 25 08:50:24 nexus kernel: [60330.325487] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> > > (...)
> > > Sep 25 08:50:25 nexus kernel: [60330.851093] Switching to clocksource acpi_pm
> > > 
> > > afaics, this amd-phenom cpu does the cpu frequency scaling and causes plain 'tsc' timer
> > > instability which leads to network card watchdog timeout (i can login via local console
> > > while any network traffic is dead). on the recent 3.5.x kernel the 'clocksource unstable'
> > > message appears *after* the 'task blocked' flood and there's no clear info about the watchdog timeout.
> > > currently i'm testing the hpet clocksource because better tsc modes (constant_tsc, nonstop_tsc)
> > > aren't present in /sys/devices/system/clocksource/clocksource0/available_clocksource while
> > > cpu supports them.
> > 
> > I'm sorry, I don't understand, that's a 2.6.37 kernel you are comparing
> > this to.  Where did this problem show up?  In 3.5.4 where 3.5.3 was
> > fine?
> 
> 'cpu-stall' from topic has appeared in 3.5.2 (after upgrade from 3.4.10).

So, can you run 'git bisect' between 3.4.10 (good) and 3.5.2 (bad) to
find the commit causing the problem?
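
A rough sketch of how that bisection could be started is below. It is
not taken from this thread: it assumes a clone of the linux-stable tree
that carries the tags v3.4.10 and v3.5.2, and the helper names
bisect_start_stall and bisect_mark are made up for illustration. Since
the two tags sit on different stable branches, git may first ask you to
test their merge base.

```shell
# Sketch only: assumes the current directory is a clone of the linux-stable
# tree that contains the tags v3.4.10 and v3.5.2.
bisect_start_stall() {
    git bisect start || return 1
    git bisect bad v3.5.2 || return 1    # first release reported to show the stall
    git bisect good v3.4.10              # last release reported to work
}

# After each step: build and boot the kernel git checked out, try to
# reproduce the stall, then record the verdict ("good" or "bad").
# Hypothetical helper, a thin wrapper around 'git bisect good|bad'.
bisect_mark() {
    case "$1" in
        good|bad) git bisect "$1" ;;
        *) echo "usage: bisect_mark good|bad" >&2; return 2 ;;
    esac
}

# When git prints "<sha> is the first bad commit", save the log and clean up:
#   git bisect log > bisect.log
#   git bisect reset
```

Each step has to run long enough for the stall to have a chance to
appear before a kernel is marked good, otherwise the result will point
at the wrong commit.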

thanks,

greg k-h
