Message-ID: <CAOLK0pySsrXEjbLor0v3zhbtUGx_437d0r5WAxWnufzZ+QwpCQ@mail.gmail.com>
Date: Thu, 5 Sep 2013 14:17:06 +0800
From: Lan Tianyu <lantianyu1986@...il.com>
To: Alex Williamson <alex.williamson@...hat.com>
Cc: "Rafael J. Wysocki" <rjw@...k.pl>,
ACPI Devel Maling List <linux-acpi@...r.kernel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
LKML <linux-kernel@...r.kernel.org>,
Linux PCI <linux-pci@...r.kernel.org>,
Yinghai Lu <yinghai@...nel.org>, Jiang Liu <liuj97@...il.com>,
Mika Westerberg <mika.westerberg@...ux.intel.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH 25/30] ACPI / hotplug / PCI: Check for new devices on enabled slots

2013/9/5 Alex Williamson <alex.williamson@...hat.com>:
> On Thu, 2013-09-05 at 01:35 +0200, Rafael J. Wysocki wrote:
>> On Wednesday, September 04, 2013 05:12:14 PM Alex Williamson wrote:
>> > On Thu, 2013-09-05 at 00:54 +0200, Rafael J. Wysocki wrote:
>> > > On Wednesday, September 04, 2013 02:36:34 PM Alex Williamson wrote:
>> > > > On Thu, 2013-07-18 at 01:32 +0200, Rafael J. Wysocki wrote:
>> > > > > From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>> > > > >
>> > > > > The current implementation of acpiphp_check_bridge() is pretty dumb:
>> > > > > - It enables a slot if it's not enabled and the slot status is
>> > > > > ACPI_STA_ALL.
>> > > > > - It disables a slot if it's enabled and the slot status is not
>> > > > > ACPI_STA_ALL.
>> > > > >
>> > > > > This behavior is not sufficient to handle the Thunderbolt daisy
>> > > > > chaining case properly, however, because in that case the bus
>> > > > > behind the already enabled slot needs to be rescanned for new
>> > > > > devices.
>> > > > >
>> > > > > For this reason, modify acpiphp_check_bridge() so that slots are
>> > > > > disabled and stopped if they are not in the ACPI_STA_ALL state.
>> > > > >
>> > > > > For slots in the ACPI_STA_ALL state, devices behind them that don't
>> > > > > respond are trimmed using a new function, trim_stale_devices(),
>> > > > > introduced specifically for this purpose. That function walks
>> > > > > the given bus and checks each device on it. If the device doesn't
>> > > > > respond, it is assumed to be gone and is removed.
>> > > > >
>> > > > > Once all of the stale devices directly behind the slot have been
>> > > > > removed, acpiphp_check_bridge() will start looking for new devices
>> > > > > that might have appeared on the given bus. It will do that even if
>> > > > > the slot is already enabled (SLOT_ENABLED is set for it).
>> > > > >
>> > > > > In addition to that, make the bus check notification ignore
>> > > > > SLOT_ENABLED and go for enable_device() directly if bridge is NULL,
>> > > > > so that devices behind the slot are re-enumerated in that case too.
>> > > > >
>> > > > > This change is based on earlier patches from Kirill A Shutemov
>> > > > > and Mika Westerberg.
>> > > > >
>> > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>> > > > > Tested-by: Mika Westerberg <mika.westerberg@...ux.intel.com>
>> > > > > ---
>> > > >
>> > > > FYI, git bisect landed on this patch as the cause of my serial console
>> > > > dying on current upstream. Further debugging to come... Thanks,
>> > >
>> > > Well, sorry about that.
>> > >
>> > > What exactly do you mean by "dying"?
>> >
>> > Sorry, I was hoping to have more details quickly, but it's been a pain
>> > to debug. By dying I mean serial console output suddenly stops during
>> > kernel boot and nothing more comes out of it until after the system is
>> > rebooted. The problem happens when acpiphp_check_bridge() calls
>> > enable_slot(). The serial console dies somewhere down in
>> > acpiphp_bus_trim(). I think this is happening on the 00:1f ISA bridge,
>> > so there's a good chance the serial ports are described as somewhere
>> > under there.
>>
>> Can you please check if that is the acpiphp_bus_trim() called by
>> acpiphp_bus_add() or the other one called from trim_stale_devices()?
>>
>> Just add a dump_stack() or WARN_ON(1) to trim_stale_devices() next to
>> the acpiphp_bus_trim() call and see if that triggers. I *think* it's the one
>> in acpiphp_bus_add(), but it won't hurt to verify that.
>
> Here's the call path:
>
> [ 16.120824] [<ffffffff81627e6c>] dump_stack+0x55/0x76
> [ 16.125979] [<ffffffff8162132e>] enable_slot+0x4ee/0x5e0
> [ 16.131396] [<ffffffff813418fb>] ? trim_stale_devices+0x5b/0xf0
> [ 16.137420] [<ffffffff81341b35>] acpiphp_check_bridge+0xd5/0x110
> [ 16.143531] [<ffffffff81342acb>] hotplug_event+0x16b/0x260
> [ 16.149115] [<ffffffff81072cd9>] ? process_one_work+0x189/0x540
> [ 16.155136] [<ffffffff81342bf0>] hotplug_event_work+0x30/0x70
> [ 16.160978] [<ffffffff81072d3b>] process_one_work+0x1eb/0x540
> [ 16.166819] [<ffffffff81072cd9>] ? process_one_work+0x189/0x540
> [ 16.172836] [<ffffffff8107353c>] worker_thread+0x11c/0x370
> [ 16.178426] [<ffffffff81073420>] ? rescuer_thread+0x350/0x350
> [ 16.184276] [<ffffffff8107b0ea>] kthread+0xea/0xf0
> [ 16.189165] [<ffffffff8107b000>] ? kthread_create_on_node+0x160/0x160
> [ 16.195700] [<ffffffff816395dc>] ret_from_fork+0x7c/0xb0
> [ 16.201109] [<ffffffff8107b000>] ? kthread_create_on_node+0x160/0x160
>
> The actual death of the serial console occurs in acpi_device_set_power()
> called from:
>
> enable_slot()
> acpiphp_bus_add()
> acpiphp_bus_trim()
> acpi_bus_trim()
> acpi_walk_namespace()
> acpi_bus_remove()
> acpi_device_unregister()
> acpi_device_set_power()
>
> I can't seem to get a path from the acpi devices in question there, so I
> have no idea what's getting trimmed here. It worries me quite a bit by
> introducing this trimming that apparently wasn't happening before
> though. Thanks,

Hi Alex,

Could you apply the following patch and boot with the kernel parameter
"acpiphp.acpiphp_debug=1"?

I suspect the patch will keep the serial port alive, because the device
will no longer be put into D3cold during trimming. I don't know yet why
the port doesn't work after being put back into D0, though. So if the
patch works, please attach the acpidump output and the dmesg. Thanks.

diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index e763651..359b23d 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -1110,7 +1110,7 @@ static void acpi_device_unregister(struct acpi_device *device)
 	 * power resources the device depends on and turn off the ones that have
 	 * no more references.
 	 */
-	acpi_device_set_power(device, ACPI_STATE_D3_COLD);
+	//acpi_device_set_power(device, ACPI_STATE_D3_COLD);
 	device->handle = NULL;
 	put_device(&device->dev);
 }
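
For context, the comment in that hunk says the D3cold transition is what
drops the power resource reference counts and turns off resources with no
references left. Below is a minimal user-space sketch of that idea, to show
what the one-line change disables. It is a toy model only: struct
power_resource, struct toy_device and the toy_*() helpers are made up for
illustration and are not the kernel's structures or API, and a plain
per-resource counter is assumed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; NOT the kernel's ACPI power resource code. */
struct power_resource {
	const char *name;
	int ref_count;	/* number of toy devices currently needing it */
	bool on;
};

struct toy_device {
	const char *name;
	struct power_resource **res;	/* resources needed while in D0 */
	size_t nres;
	bool in_d0;
};

/* Enter D0: take a reference on every resource, switching on unused ones. */
static void toy_set_d0(struct toy_device *dev)
{
	size_t i;

	if (dev->in_d0)
		return;
	for (i = 0; i < dev->nres; i++) {
		if (dev->res[i]->ref_count++ == 0)
			dev->res[i]->on = true;
	}
	dev->in_d0 = true;
}

/* Enter D3cold: drop the references, switching off resources at zero. */
static void toy_set_d3cold(struct toy_device *dev)
{
	size_t i;

	if (!dev->in_d0)
		return;
	for (i = 0; i < dev->nres; i++) {
		if (--dev->res[i]->ref_count == 0)
			dev->res[i]->on = false;
	}
	dev->in_d0 = false;
}

/*
 * Unregister a toy device. power_down == true mimics the unpatched code
 * (transition to D3cold first); power_down == false mimics the debug
 * patch above, where the transition is commented out.
 */
static void toy_unregister(struct toy_device *dev, bool power_down)
{
	if (power_down)
		toy_set_d3cold(dev);
}
```

In this model, with two toy devices sharing one resource, unregistering the
first leaves the resource on. Unregistering the last one with power_down set
turns it off, while with power_down cleared the resource stays on, which is
the effect the debug patch is meant to have.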
>
> Alex
>
--
Best regards
Tianyu Lan