Message-Id: <200709210035.55083.rjw@sisk.pl>
Date: Fri, 21 Sep 2007 00:35:53 +0200
From: "Rafael J. Wysocki" <rjw@...k.pl>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, Jaroslav Kysela <perex@...e.cz>,
Takashi Iwai <tiwai@...e.de>,
linux-usb-devel@...ts.sourceforge.net,
Venkatesh Pallipadi <venkatesh.pallipadi@...el.com>,
Ingo Molnar <mingo@...e.hu>,
Linus Torvalds <torvalds@...l.org>, miklos@...redi.hu
Subject: Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

Thomas,
On Thursday, 20 September 2007 23:53, Thomas Gleixner wrote:
> Rafael,
>
> On Thu, 2007-09-20 at 23:45 +0200, Rafael J. Wysocki wrote:
> > > We disable everything in device_suspend()
> >
> > No, we don't. sysdevs are _not_ suspended in device_suspend().
> > They are suspended in device_power_down(), which is called
> > _after_ disable_nonboot_cpus() (from swsusp_suspend()).
> >
> > > including timekeeping,
> >
> > No, the timekeeping is suspended in device_power_down() (or at least it should
> > be).
>
> Damn, you are right. Reading through 30 different logs confused me.
>
> > > enable_nonboot_cpus();
> >
> > Actually, we can't do this here, because of ACPI and some interrupt handling
> > related problems. Unfortunately, platform_finish() needs to go _after_
> > enable_nonboot_cpus() and device_resume() needs to go after platform_finish().
> > Analogously, disable_nonboot_cpus() has to go after platform_prepare().
> >
> > Otherwise, some systems will break.
>
> Well, I don't buy this one. The system would break in the same way, when
> I take CPU#1 offline before I initiate the suspend.
I was referring to the resume part. If we call enable_nonboot_cpus(), which
executes the _INI ACPI control method, after platform_finish(), which executes
the _WAK global ACPI control method, things will break. That already happened
in the past, when the code ordering was different, AFAICS.
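To make the ordering constraints concrete, here is a minimal userspace sketch (not kernel code). It only encodes the three orderings stated above as a check over a list of call names; hibernate_order_ok() and its helpers are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Position of `name` in the call sequence, or -1 if it is absent. */
static int pos_of(const char *const seq[], int n, const char *name)
{
	for (int i = 0; i < n; i++)
		if (strcmp(seq[i], name) == 0)
			return i;
	return -1;
}

/* 1 if both calls are present and `first` comes earlier than `second`. */
static int before(const char *const seq[], int n,
		  const char *first, const char *second)
{
	int a = pos_of(seq, n, first);
	int b = pos_of(seq, n, second);

	return a >= 0 && b >= 0 && a < b;
}

/*
 * Hypothetical checker: returns 1 if the sequence respects the orderings
 * described in this message, 0 otherwise.
 *   - platform_prepare()    before disable_nonboot_cpus()
 *   - enable_nonboot_cpus() before platform_finish()   (_INI before _WAK)
 *   - platform_finish()     before device_resume()
 */
static int hibernate_order_ok(const char *const seq[], int n)
{
	assert(seq != NULL && n >= 0);
	return before(seq, n, "platform_prepare", "disable_nonboot_cpus") &&
	       before(seq, n, "enable_nonboot_cpus", "platform_finish") &&
	       before(seq, n, "platform_finish", "device_resume");
}
```

A sequence that calls platform_finish ahead of enable_nonboot_cpus fails the check, which is the _WAK-before-_INI case I am worried about.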
> > > and unsurprisingly the "my VAIO needs help from keyboard" problem went
> > > away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
> > > work at all on my VAIO due to some not yet identified wreckage)
> >
> > Hm, I really don't know why it helps, but that's not because of the timekeeping
> > suspend, IMO.
>
> It is related. We rely on some subtle thing which is not up when we
> resume the non boot cpu.
Yes, it looks that way.
> > > I did not yet look into the suspend to ram code, but I guess that there
> > > is an equivalent problem.
> >
> > Yes, the code ordering is the same, but it's not totally wrong, IMHO.
> >
> > > But I have no idea why this affects Andrew's jinxed VAIO (UP machine),
> > > though I suspect that we have more timekeeping/timer depending code
> > > somewhere waiting to bite us.
> >
> > That's possible.
> >
> > > Also I still need to debug why the HIBERNATION_TEST code path (which has
> > > a msleep(5000) in it) does not fail,
> >
> > See above. :-)
>
> Yes. It makes sense. When I change the TEST code path to:
>
> - printk("swsusp debug: Waiting for 5 seconds.\n");
> - msleep(5000);
> + printk("swsusp debug: before swsusp_suspend\n");
> + error = swsusp_suspend();
>
> then I have the same effect as I get from real hibernation. And we
> actually shut down time keeping somewhere in that code path.
>
> ACPI: PCI interrupt for device 0000:00:1b.0 disabled
> swsusp debug: before swsusp_suspend
> Suspend timekeeping
Exactly. timekeeping_suspend() is called from device_power_down(), which is
called from swsusp_suspend() (after disabling interrupts).
> swsusp: critical section:
> swsusp: Need to copy 112429 pages
> swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
> swsusp: critical section: done (112429 pages copied)
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> Resume timekeeping
> ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 16 (level, low) -> IRQ 16
> -> works fine
>
> This is with my patch applied. Without that I get:
>
> CPU1 is down
> swsusp debug: before swsusp_suspend
> Suspend timekeeping
> swsusp: critical section:
> swsusp: Need to copy 112429 pages
> swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
> swsusp: critical section: done (112429 pages copied)
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> Resume timekeeping
> Enabling non-boot CPUs
> --> Waits forever until a key is pressed
Well, perhaps there's something else that we should suspend late and resume
early, but we don't?
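For reference, the nesting visible in the failing log above (timekeeping goes down last and comes back first, and the non-boot CPUs are brought up only afterwards) can be sketched as a small userspace model. This is not kernel code: the model_* functions are hypothetical stand-ins that only record the order of steps. That timekeeping is resumed from the power-up counterpart of device_power_down(), and the exact point where interrupts are re-enabled, are assumptions of the sketch.

```c
#include <assert.h>
#include <string.h>

#define TRACE_MAX 256
static char trace[TRACE_MAX];

/* Append one step name to the trace, separated by single spaces. */
static void step(const char *name)
{
	/* Room for an optional separator plus the terminating NUL. */
	assert(strlen(trace) + strlen(name) + 2 <= TRACE_MAX);
	if (trace[0] != '\0')
		strcat(trace, " ");
	strcat(trace, name);
}

/* Stand-ins for the sysdev suspend/resume paths; they only log the step. */
static void model_device_power_down(void) { step("timekeeping_suspend"); }
static void model_device_power_up(void)   { step("timekeeping_resume"); }

/* Model of the swsusp_suspend() critical section. */
static void model_swsusp_suspend(void)
{
	step("irqs_off");
	model_device_power_down();	/* sysdevs, including timekeeping */
	step("snapshot");
	model_device_power_up();	/* assumption: resumes timekeeping */
	step("irqs_on");		/* assumption: position of re-enable */
}

/* Model of the modified TEST path; returns the recorded trace. */
static const char *model_hibernate_test(void)
{
	trace[0] = '\0';
	step("disable_nonboot_cpus");
	model_swsusp_suspend();
	step("enable_nonboot_cpus");
	return trace;
}
```

In this model anything that enable_nonboot_cpus relies on has to be back up by the time irqs_on is recorded, so whatever is missing would have to be resumed inside that window as well.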
Greetings,
Rafael