Message-Id: <1218701151.4285.9.camel@localhost>
Date: Thu, 14 Aug 2008 10:05:51 +0200
From: Milan Plzik <milan.plzik@...il.com>
To: "Pallipadi, Venkatesh" <venkatesh.pallipadi@...el.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>
Subject: RE: Possible CPU_IDLE bug [WAS: Re: Timer unstability on
when using C2 and deeper sleep states (Dell Latitude XT)]
On Wed, 2008-08-13 at 14:22 -0700, Pallipadi, Venkatesh wrote:
>
> >-----Original Message-----
> >From: Milan Plzik [mailto:milan.plzik@...il.com]
> >Sent: Wednesday, August 13, 2008 1:21 PM
> >To: Pallipadi, Venkatesh
> >Cc: linux-kernel@...r.kernel.org
> >Subject: RE: Possible CPU_IDLE bug [WAS: Re: Timer unstability
> >on when using C2 and deeper sleep states (Dell Latitude XT)]
> >
> >On Wed, 2008-08-13 at 11:14 -0700, Pallipadi, Venkatesh wrote:
> >>
> >> >-----Original Message-----
> >> >From: linux-kernel-owner@...r.kernel.org
> >> >[mailto:linux-kernel-owner@...r.kernel.org] On Behalf Of Milan Plzik
> >> >Sent: Wednesday, August 13, 2008 7:16 AM
> >> >To: linux-kernel@...r.kernel.org
> >> >Subject: Possible CPU_IDLE bug [WAS: Re: Timer unstability on
> >> >when using C2 and deeper sleep states (Dell Latitude XT)]
> >> >
> >> > Hello again,
> >> >
> >> >On Wed, 2008-08-13 at 12:58 +0200, Milan Plzik wrote:
> >> >> I apologize for replying on my own mail (and also for top-posting, but
> >> >> this information is global update, not exactly fitting any of topics
> >> >> mentioned below).
> >> >>
> >> >> After playing for a longer while I found out that the system ends
> >> >> sometimes in state where, in order to do anything useful, I need to
> >> >> press keys on keyboard. Otherwise, the system just stalls and does
> >> >> nothing. I have no idea why does this happen (especially when I know
> >> >> that OHCI or wireless network adapter produce fair amount of
> >> >> interrupts). My /proc/interrupts is below. Just for the record, chipset
> >> >> on board is ATI RS600 with (apparently from lspci) ATI SB600
> >> >> southbridge.
> >> >
> >> > it looks like this problem disappears if CONFIG_CPU_IDLE option is
> >> >disabled, system seems to be stable for more than one hour. This
> >> >suggests that something may be wrong with the CPU_IDLE code. I can not
> >> >spend much more time by debugging the kernel, but if anyone has an
> >> >suggestion about what to fix, I will gladly test it.
> >> >
> >> > Best regards,
> >> > Milan Plzik
> >>
> >> It may not be a problem with cpuidle code per se. We have had issues
> >> earlier like this one
> >>
> >> http://bugzilla.kernel.org/show_bug.cgi?id=10011
> >>
> >> Cpuidle tries to go to C3 state aggressively and thus may be indirectly
> >> causing the problem with graphics hardware or something like that.
> >>
> >> From Dave's comment in the above bugzilla:
> >> can you try with Option "DRI" "Off" in your xorg.conf
> >>
> >> Does that change anything?
> >
> > The DRI flag itself seems to have little to no effect on what actually
> >happens. I noticed that the problems are really visible with
> >CONFIG_HZ_1000 and no preemption, other settings seem to blur the
> >problem a little (but it seems to be still there). I did some additional
> >testing, below are the results. Testing programs were: powertop (ran
> >immediately after booting), X server startup and starting mplayer with
> >some videos.
> >
> >1) plain boot with processor.max_cstate not set, DRI off:
> > (boot process seemed to stall here and there)
> > a) powertop on console (before running X server) returns bogus
> >numbers, like 20000 wakeups/sec.
> > b) starting X server -- succeeds, but only after tapping keys on
> >keyboard, otherwise seems to stall.
> > c) mplayer seems to get stuck here and there, keypresses help and it
> >is able to play a little more of the video for a while.
> > d) additional observation: keyboard autorepeat stopped (mostly)
> >working, though it was enabled in both X server and console
> >
> >2) processor.max_cstate=2, DRI off
> > a) powertop on console starts giving rational numbers, such as 300
> >wakeups/sec
> > b) X server seems to start correctly
> > c) mplayer seems to play files for a while, then it starts flickering
> >as if it wasn't able to keep up with speed; at the same time powertop
> >reports 90% of time spent in C2
> >
> >2a) processor.max_cstate=2, DRI on (just changed X server configuration
> >without reboot)
> > video playback seems to be more stable, but that might be just GPU
> >acceleration
> >
> >3) processor.max_cstate=2, DRI on after cold reboot
> > symptoms like with attempt 1), but powertop returns correct numbers
> >
> >4) processor.max_cstate=1, DRI on
> > in this state I'm writing this e-mail and so far seems to be
> >stable :)
> >
> ><guess>
> > I can just guess what causes these problems... . 1) might seem like
> >improper timer setup after resuming from C3 (at least that would explain
> >that weird powertop numbers).
> >
> > The issue with keyboard needing to be pressed seems more like some
> >race condition, when sometimes the interrupts are not properly enabled
> >-- sometimes it works, sometimes not.
> ></guess>
> >
> > I hope these results will help at least a little. If something other
> >is necessary, I'll try to do it ASAP.
> >
>
> Were all these tests with 2.6.26? Can you try with 2.6.27-rc3?
>
> There is one bugfix patch that, IIRC, went in after 2.6.26.
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b8f8c3cf0a4ac0632ec3f0e15e9dc0c29de917af
I tried 2.6.27-rc3 just now. It performs a bit better than 2.6.26 (e.g. I
no longer get those states where nothing happens until I press a key) and
even reports somewhat more reasonable wakeup values, but the system
sometimes becomes rather slow (e.g. when playing video). I was not able to
compile the fglrx driver, so I had to switch to the radeon one. The number
of wakeups reported is still not very convincing, though: it changes from
300 to 600 (which is about twice the sum of the wakeups) for no apparent
reason, and sometimes goes even higher.
I also tried the nolapic_timer option, but it didn't help.
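In case it helps anyone cross-check the wakeup figures independently of
powertop, here is a rough sketch that samples /proc/interrupts twice and
prints per-IRQ rates. The function names (parse_interrupts, rates, sample)
and the 5 second interval are my own choices for illustration. Interrupt
rates are only an approximation of what powertop calls wakeups, so the
totals should be expected to be in the same ballpark rather than identical.

```python
import time


def parse_interrupts(text):
    """Return {irq_label: count summed over all CPUs} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = []
        # The first ncpu fields are per-CPU counters; rows such as ERR
        # carry fewer, so stop at the first non-numeric field.
        for tok in rest.split()[:ncpu]:
            if not tok.isdigit():
                break
            counts.append(int(tok))
        if counts:
            totals[label.strip()] = sum(counts)
    return totals


def rates(before, after, seconds):
    """Interrupts per second for each IRQ between two snapshots."""
    return {irq: (after[irq] - before.get(irq, 0)) / seconds for irq in after}


def sample(path="/proc/interrupts", interval=5.0):
    """Take two snapshots `interval` seconds apart and return the rates."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_interrupts(f.read())
    return rates(before, after, interval)


if __name__ == "__main__":
    result = sample()
    for irq, rate in sorted(result.items(), key=lambda kv: -kv[1]):
        if rate > 0:
            print(f"{irq:>6}: {rate:8.1f}/s")
    print(f" total: {sum(result.values()):8.1f}/s")
```

Running it on an otherwise idle console for each processor.max_cstate
setting should show whether the total interrupt rate really doubles when
powertop's number does, or whether only the reported figure changes.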
>
> Thanks,
> Venki
Thank you,
Milan