Message-Id: <1218658873.4336.15.camel@localhost>
Date:	Wed, 13 Aug 2008 22:21:13 +0200
From:	Milan Plzik <milan.plzik@...il.com>
To:	"Pallipadi, Venkatesh" <venkatesh.pallipadi@...el.com>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: Possible CPU_IDLE bug [WAS: Re: Timer unstability on when
	using C2 and deeper sleep states (Dell Latitude XT)]

On Wed, 2008-08-13 at 11:14 -0700, Pallipadi, Venkatesh wrote:
> 
> >-----Original Message-----
> >From: linux-kernel-owner@...r.kernel.org
> >[mailto:linux-kernel-owner@...r.kernel.org] On Behalf Of Milan Plzik
> >Sent: Wednesday, August 13, 2008 7:16 AM
> >To: linux-kernel@...r.kernel.org
> >Subject: Possible CPU_IDLE bug [WAS: Re: Timer unstability on
> >when using C2 and deeper sleep states (Dell Latitude XT)]
> >
> >  Hello again,
> >
> >On Wed, 2008-08-13 at 12:58 +0200, Milan Plzik wrote:
> >> I apologize for replying on my own mail (and also for top-posting, but
> >> this information is global update, not exactly fitting any of topics
> >> mentioned below).
> >>
> >>   After playing for a longer while I found out that the system ends
> >> sometimes in state where, in order to do anything useful, I need to
> >> press keys on keyboard. Otherwise, the system just stalls and does
> >> nothing. I have no idea why does this happen (especially when I know
> >> that OHCI or wireless network adapter produce fair amount of
> >> interrupts). My /proc/interrupts is below. Just for the record, chipset
> >> on board is ATI RS600 with (apparently from lspci) ATI SB600
> >> southbridge.
> >
> >  it looks like this problem disappears if CONFIG_CPU_IDLE option is
> >disabled, system seems to be stable for more than one hour. This
> >suggests that something may be wrong with the CPU_IDLE code. I can not
> >spend much more time by debugging the kernel, but if anyone has an
> >suggestion about what to fix, I will gladly test it.
> >
> >  Best regards,
> >        Milan Plzik
> 
> It may not be a problem with cpuidle code per se. We have had issues earlier like this one
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=10011
> 
> Cpuidle tries to go to C3 state aggressively and thus may be indirectly causing the problem with graphics hardware or something like that.
> 
> From Dave's comment in the above bugzilla:
> can you try with Option "DRI" "Off" in your xorg.conf
> 
> Does that change anything?

  The DRI flag itself seems to have little to no effect on what actually
happens. The problems are most visible with CONFIG_HZ_1000 and no
preemption; other settings seem to blur the problem a little (but it
still seems to be there). I did some additional testing; the results are
below. The test steps were: powertop (run immediately after booting),
X server startup, and starting mplayer with some videos.

1) plain boot with processor.max_cstate not set, DRI off:
  (boot process seemed to stall here and there)
  a) powertop on console (before running X server) returns bogus
numbers, like 20000 wakeups/sec.
  b) starting X server -- succeeds, but only after tapping keys on
keyboard, otherwise seems to stall. 
  c) mplayer seems to get stuck here and there, keypresses help and it
is able to play a little more of the video for a while.
  d) additional observation: keyboard autorepeat stopped (mostly)
working, though it was enabled in both X server and console

2) processor.max_cstate=2, DRI off
  a) powertop on console starts giving sane numbers, such as 300
wakeups/sec
  b) X server seems to start correctly
  c) mplayer seems to play files for a while, then it starts flickering
as if it couldn't keep up; at the same time powertop reports 90% of
time spent in C2

2a) processor.max_cstate=2, DRI on (just changed X server configuration
without reboot)
  video playback seems to be more stable, but that might be just GPU
acceleration  

3) processor.max_cstate=2, DRI on after cold reboot
  same symptoms as in 1), but powertop returns correct numbers

4) processor.max_cstate=1, DRI on
  I'm writing this e-mail in this state, and so far it seems stable :)
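
To confirm which of the cases above a given boot actually corresponds
to, the limit can be read back from the kernel command line. A minimal
sketch in Python, assuming the limit was passed as a boot parameter (as
in the tests above) and not set some other way; the helper name
parse_max_cstate is made up for illustration:

```python
import re


def parse_max_cstate(cmdline):
    """Return processor.max_cstate from a kernel command line, or None if unset."""
    # Match the parameter only as a whole, whitespace-delimited token.
    match = re.search(r"(?:^|\s)processor\.max_cstate=(\d+)(?=\s|$)", cmdline)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel was booted with.
    with open("/proc/cmdline") as f:
        print(parse_max_cstate(f.read()))
```

A result of None corresponds to case 1) (no limit given); 2 and 1
correspond to cases 2)/3) and 4) respectively.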

<guess>
  I can only guess what causes these problems. 1) might be caused by
improper timer setup after resuming from C3 (at least that would explain
those weird powertop numbers).

  The issue with the keyboard needing to be pressed looks more like some
race condition in which the interrupts are sometimes not properly
enabled -- sometimes it works, sometimes not.
</guess>
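
One way to cross-check the powertop numbers independently is to sample
/proc/interrupts twice and compute the rate. The sketch below is only a
rough proxy: it counts all interrupts on all CPUs and may not match
powertop's wakeup figure exactly. The function names are hypothetical.

```python
import time


def total_interrupts(text):
    """Sum the per-CPU counters on every row of /proc/interrupts-style text."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    total = 0
    for line in lines[1:]:
        parts = line.split(":", 1)
        if len(parts) != 2:
            continue
        # Only the first ncpus columns are counters; the rest is description.
        counts = parts[1].split()[:ncpus]
        total += sum(int(c) for c in counts if c.isdigit())
    return total


def interrupts_per_second(before, after, seconds):
    """Average interrupt rate between two snapshots taken `seconds` apart."""
    return (total_interrupts(after) - total_interrupts(before)) / seconds


if __name__ == "__main__":
    interval = 5.0
    with open("/proc/interrupts") as f:
        first = f.read()
    time.sleep(interval)
    with open("/proc/interrupts") as f:
        second = f.read()
    print(interrupts_per_second(first, second, interval))
```

If the rate computed this way stays in the hundreds while powertop
reports something like 20000 wakeups/sec, that would point at the
accounting rather than at a real interrupt storm.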

  I hope these results will help at least a little. If anything else is
necessary, I'll try to do it ASAP.

> 
> Thanks,
> Venki

  Thank you, :)
	Milan
