Date:	Sun, 12 Jun 2011 22:30:05 -0400
From:	Ben Hutchings <bhutchings@...arflare.com>
To:	Andy Isaacson <adi@...apodia.org>
Cc:	"Paul E. McKenney" <paulmck@...ibm.com>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	linux-pm@...ts.linux-foundation.org
Subject: Re: rcu_sched_state detected stall on CPU 0, 3.0-rc2

On Sun, 2011-06-12 at 16:55 -0700, Andy Isaacson wrote:
> Let's CC netdev and linux-pm since this is obviously a suspend issue,
> and may have something to do with ethtool.
> 
> On Sun, Jun 12, 2011 at 04:11:43PM -0700, Andy Isaacson wrote:
> > On Sun, Jun 12, 2011 at 12:58:56PM -0700, Andy Isaacson wrote:
> > > My Thinkpad x201s threw some errors (?) a few minutes after resuming
> > > from suspend-to-ram this morning.
> > > 
> > > [56415.672140] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
> > > 
> > > Nothing jumps out of the backtraces at me.  Full dmesg and config
> > > attached.  This was my first StR since upgrading from 2.6.39, let's see
> > > if it fails again when I suspend after sending this email. :)
> > 
> > I haven't had a fully successful StR cycle yet (in 5 tries), although I
> > can't pin them all on RCU.  On try 2 it hung completely about 10 seconds
> > after I unlocked the screensaver, on try 3 it came back to a black
> > console, and on try 4 it didn't suspend at all (blinking moon LED but
> > battery LED and CPU fan still on).
> 
> Of course now that I'm trying to debug, I am seeing many successful
> suspend-resume cycles.  I don't see any signs of difference between the
> cases that hung and the cases that are now succeeding.
> 
> CCing netdev, because I suspend by running pm-suspend, and in at least
> one failure, an ethtool running under pm-suspend seemed to be the
> problem:
> 
> root 11558 pts/8    S+ \_ /bin/sh /usr/lib/pm-utils/sleep.d/00powers
> root 11559 pts/8    S+     \_ /bin/sh /usr/sbin/pm-powersave
> root 11576 pts/8    S+         \_ /bin/sh /usr/lib/pm-utils/power.d/
> root 11577 pts/8    D+             \_ ethtool -s eth0 wol g
[...]

Wake-on-LAN configuration is entirely handled by the relevant driver;
the ethtool core just copies the parameters in and out.  It looks like
there is some sort of deadlock or missing unlock in the driver.  So my
question would be which driver is running eth0?
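
One way to answer that question is to read the driver symlink that sysfs exposes for the interface. The following is only a minimal sketch: the function name `driver_for_interface` and its `sysfs_root` parameter are made up for illustration, and it assumes the usual `/sys/class/net/<iface>/device/driver` layout, which virtual interfaces such as `lo` do not have.

```python
import os


def driver_for_interface(iface, sysfs_root="/sys"):
    """Return the name of the kernel driver bound to a network interface.

    Returns None when the interface has no device/driver link, as is
    the case for virtual interfaces or for a name that does not exist.
    Both the function name and the sysfs_root parameter are hypothetical
    helpers for this sketch.
    """
    link = os.path.join(sysfs_root, "class", "net", iface, "device", "driver")
    try:
        # The link points at the driver's directory under /sys/bus/.../drivers/.
        target = os.readlink(link)
    except OSError:
        return None
    # The last path component of the link target is the driver name.
    return os.path.basename(target)
```

Calling `driver_for_interface("eth0")` on the affected machine should give the same driver name that `ethtool -i eth0` reports in its `driver:` line.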

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

