Message-ID: <1728491.bnoskTaLZT@localhost>
Date:	Tue, 25 Sep 2012 19:04:19 +0200
From:	Paweł Sikora <pluto@...-linux.org>
To:	Greg KH <gregkh@...uxfoundation.org>
Cc:	linux-kernel@...r.kernel.org, arekm@...-linux.org,
	baggins@...-linux.org
Subject: Re: [3.5.4] rcu_sched self-detected stall on CPU { 1}  (t=54862991 jiffies)

On Tuesday 25 of September 2012 09:44:54 Greg KH wrote:
> On Tue, Sep 25, 2012 at 06:31:36PM +0200, Paweł Sikora wrote:
> > On Monday 24 of September 2012 10:36:33 Greg KH wrote:
> > > On Mon, Sep 24, 2012 at 10:05:23AM +0200, Paweł Sikora wrote:
> > > > Hi,
> > > > 
> > > > with the new stable line i'm observing strange locks on my old amd-phenom-II mini-server.
> > > > here's a dmesg:
> > > 
> > > Did this show up in 3.5.3?  If not, can you run 'git bisect' to find the
> > > problem patch?
> > 
> > heh, the good old kernel shed some light on this issue.
> > 
> > Sep 25 08:50:24 nexus kernel: [60330.301639] Clocksource tsc unstable (delta = -474690884 ns)
> > Sep 25 08:50:24 nexus kernel: [60330.325477] ------------[ cut here ]------------
> > Sep 25 08:50:24 nexus kernel: [60330.325484] WARNING: at /home/users/builder/rpm/BUILD/kernel-2.6.37.6/linux-2.6.37/net/sched/sch_generic.c:258 dev_watchdog+0x25d/0x270()
> > Sep 25 08:50:24 nexus kernel: [60330.325486] Hardware name: GA-MA785GMT-UD2H
> > Sep 25 08:50:24 nexus kernel: [60330.325487] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> > (...)
> > Sep 25 08:50:25 nexus kernel: [60330.851093] Switching to clocksource acpi_pm
> > 
> > afaics, this amd-phenom cpu does cpu frequency scaling, which makes the plain 'tsc' timer
> > unstable and leads to the network card watchdog timeout (i can log in via the local console
> > while all network traffic is dead). on the recent 3.5.x kernel the 'clocksource unstable'
> > message appears *after* the 'task blocked' flood and there's no clear info about the watchdog timeout.
> > currently i'm testing the hpet clocksource because the better tsc modes (constant_tsc, nonstop_tsc)
> > aren't present in /sys/devices/system/clocksource/clocksource0/available_clocksource although
> > the cpu supports them.
> 
> I'm sorry, I don't understand, that's a 2.6.37 kernel you are comparing
> this to.  Where did this problem show up?  In 3.5.4 where 3.5.3 was
> fine?

the 'cpu-stall' from the subject appeared in 3.5.2 (after an upgrade from 3.4.10).
3.5.4 has the same problem as 3.5.2, so i went back to the initial 2.6.37.6,
which had worked fine for many months. now i'm pretty sure that all these problems
are related to tsc instability and appear on different kernels in different forms.
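
for reference, a minimal sketch of how the clocksource state could be checked on a
live system, assuming Python 3 and the standard sysfs and procfs paths. note that
constant_tsc and nonstop_tsc are cpu feature flags reported in /proc/cpuinfo, so they
would not be expected to show up as entries in available_clocksource. the helper names
(parse_cpu_flags, tsc_report, read_report) are made up for illustration:

```python
from pathlib import Path

# Standard sysfs location of the clocksource controls (same path as quoted above).
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags of the first cpu listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def tsc_report(available_text, current_text, cpuinfo_text):
    """Summarise clocksource selection and TSC-related cpu flags."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {
        "current": current_text.strip(),
        "available": available_text.split(),
        "constant_tsc": "constant_tsc" in flags,
        "nonstop_tsc": "nonstop_tsc" in flags,
    }


def read_report():
    """Hypothetical convenience wrapper: read the live files on a Linux box."""
    return tsc_report(
        (CLOCKSOURCE_DIR / "available_clocksource").read_text(),
        (CLOCKSOURCE_DIR / "current_clocksource").read_text(),
        Path("/proc/cpuinfo").read_text(),
    )
```

calling read_report() on the affected machine would show which clocksource is in use
and whether the cpu advertises the two flags; the parsing helpers take plain strings,
so they can also be run against saved copies of those files.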

