Message-ID: <4AE947D3.5070500@jp.fujitsu.com>
Date: Thu, 29 Oct 2009 16:44:19 +0900
From: Kenji Kaneshige <kaneshige.kenji@...fujitsu.com>
To: Jens Axboe <jens.axboe@...cle.com>
CC: Alex Chiang <achiang@...com>, Mark Lord <lkml@....ca>,
Greg KH <greg@...ah.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
jbarnes@...tuousgeek.org, linux-pci@...r.kernel.org
Subject: Re: pci-express hotplug
Jens Axboe wrote:
> On Wed, Oct 28 2009, Kenji Kaneshige wrote:
>> Jens Axboe wrote:
>>> On Tue, Oct 27 2009, Kenji Kaneshige wrote:
>>>> Jens Axboe wrote:
>>>>> On Tue, Oct 20 2009, Alex Chiang wrote:
>>>>>> * Jens Axboe <jens.axboe@...cle.com>:
>>>>>>> On Tue, Oct 13 2009, Alex Chiang wrote:
>>>>>>>>>> Can you modprobe acpiphp with debug=1? And send the output?
>>>>>>>>> acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
>>>>>>>>> acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:00:05.0
>>>>>>>>> acpiphp_glue: found ACPI PCI Hotplug slot 1 at PCI 0000:08:00
>>>>>>>>> acpiphp: Slot [1] registered
>>>>>>>>> acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:00:07.0
>>>>>>>>> acpiphp_glue: found ACPI PCI Hotplug slot 2 at PCI 0000:0b:00
>>>>>>>>> acpiphp: Slot [2] registered
>>>>>>>>> acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:80:07.0
>>>>>>>>> acpiphp_glue: found ACPI PCI Hotplug slot 6 at PCI 0000:84:00
>>>>>>>>> acpiphp: Slot [6] registered
>>>>>>>>> acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:80:09.0
>>>>>>>>> acpiphp_glue: found ACPI PCI Hotplug slot 7 at PCI 0000:87:00
>>>>>>>>> acpiphp: Slot [7] registered
>>>>>>>>> acpiphp_glue: Bus 0000:87 has 1 slot
>>>>>>>>> acpiphp_glue: Bus 0000:84 has 1 slot
>>>>>>>>> acpiphp_glue: Bus 0000:0b has 1 slot
>>>>>>>>> acpiphp_glue: Bus 0000:08 has 1 slot
>>>>>>>>> acpiphp_glue: Total 4 slots
>>>>>>>> You mentioned in another mail that you echoed 1 into the various
>>>>>>>> slots' power files.
>>>>>>>>
>>>>>>>> Did you do that after modprobing acpiphp with debug=1?
>>>>>>>>
>>>>>>>> If so, there should be debug output when you try and turn them
>>>>>>>> on.
>>>>>>> It produces:
>>>>>>>
>>>>>>> acpiphp: enable_slot - physical_slot = 1
>>>>>>> acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
>>>>>>> acpiphp: enable_slot - physical_slot = 2
>>>>>>> acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
>>>>>>> acpiphp: enable_slot - physical_slot = 6
>>>>>>> acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
>>>>>>> acpiphp: enable_slot - physical_slot = 7
>>>>>>> acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
>>>>>> Hm, so for some reason, firmware on your machine is telling us
>>>>>> that it doesn't think cards are present and/or enabled.
>>>>>>
>>>>>> Unfortunately, I don't know why your firmware would be saying
>>>>>> that. We could add some more debug printks to see what firmware
>>>>>> thinks about your system... Or we could just wait and see what
>>>>>> happens after you get your hardware replaced.
>>>>> New board, the exact same thing happens.
>>>>>
>>>>>>> I have a card in one of the slots only this time.
>>>>>>>
>>>>>>>> Also, quick dummy check, you are trying to power on populated
>>>>>>>> slots, right? :)
>>>>>>> Yes :-)
>>>>>>>
>>>>>>>> Can you send the output of lspci -vv? And I like the output of
>>>>>>>> lspci -vt as well... Both before and after loading acpiphp
>>>>>>>> please.
>>>>>>> Send privately.
>>>>>> No difference in before and after. Odd.
>>>>>>
>>>>>> If you want to poke us again after your hardware swap, please do
>>>>>> so. Sorry for being not so helpful. :-/
>>>>> Poke :-)
>>>>>
>>>>> One more thing I tried was pushing the power button on the slot
>>>>> manually. With acpiphp, I get the same messages as above. Using pciehp,
>>>>> I get the same power fault bit interrupt storm. So no difference from
>>>>> using the sysfs interface or doing it on the box side, doesn't work
>>>>> either way.
>>>>>
>>>> I'd like to confirm power fault interrupt storm, just in case.
>>>> Could you get /proc/interrupts information after power fault
>>>> problem happens and send it to me?
>>> The box pretty much hangs when I try to power on a slot with pciehp, so
>>> it's not easy to do... It doesn't hang with acpiphp, but doesn't work
>>> either (see previous reply to Alex).
>>>
>> Could you try the attached debugging patch? With this patch, power
>> fault interrupt would be disabled after 100 power fault detected (
>> I hope so). You can get /proc/interrupts after that.
>
> Here is the output of doing the power on with that patch applied.
>
> pciehp 0000:00:05.0:pcie04: enable_slot: physical_slot = 1
> pciehp 0000:00:05.0:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 77b
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 10
> pciehp 0000:00:05.0:pcie04: pciehp_power_on_slot: SLOTCTRL a8 write cmd 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 10
> pciehp 0000:00:05.0:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: Power fault interrupt received
> pciehp 0000:00:05.0:pcie04: Power fault on Slot(1)
> pciehp 0000:00:05.0:pcie04: Power fault bit 0 set
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
> pciehp 0000:00:05.0:pcie04: Data Link Layer Link Active not set in 1000 msec
> pciehp 0000:00:05.0:pcie04: pciehp_check_link_status: lnk_status = 1001
> pciehp 0000:00:05.0:pcie04: Link Training Error occurs
> pciehp 0000:00:05.0:pcie04: Failed to check link status
> pciehp 0000:00:05.0:pcie04: Command not completed in 1000 msec
> pciehp 0000:00:05.0:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 12
> pciehp 0000:00:05.0:pcie04: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 12
> pciehp 0000:00:05.0:pcie04: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
> pciehp 0000:00:05.0:pcie04: Command not completed in 1000 msec
> pciehp 0000:00:05.0:pcie04: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
> pciehp 0000:00:05.0:pcie04: Command not completed in 1000 msec
> pciehp 0000:00:05.0:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
> pciehp 0000:00:05.0:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 779
> pciehp 0000:00:05.0:pcie04: pciehp_get_attention_status: SLOTCTRL a8, value read 779
>
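
As a reading aid for the log above, here is a minimal sketch that decodes the
intr_loc values. It assumes that intr_loc is printed in hexadecimal and that it
carries the event bits of the PCI Express Slot Status register; the helper name
decode_intr_loc is hypothetical and is not part of pciehp.

```python
# Event bits in the low five bits of the PCI Express Slot Status register.
# Assumption: pciehp's "intr_loc" debug value is a hex mask of these bits.
SLOT_STATUS_EVENTS = {
    0x01: "Attention Button Pressed",
    0x02: "Power Fault Detected",
    0x04: "MRL Sensor Changed",
    0x08: "Presence Detect Changed",
    0x10: "Command Completed",
}


def decode_intr_loc(hex_text):
    """Return the names of the event bits set in a hex intr_loc string."""
    value = int(hex_text, 16)
    return [name for bit, name in sorted(SLOT_STATUS_EVENTS.items())
            if value & bit]


# The three distinct values that appear in the log above.
for sample in ("10", "2", "12"):
    print(sample, decode_intr_loc(sample))
```

Under that assumption, the long run of "intr_loc 2" lines is the Power Fault
Detected bit being reported again and again.
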
From the console log, it seems that my debug patch worked as I expected
(power fault event interrupts were disabled after 100 power fault events).
But for some reason, /proc/interrupts shows only 5 interrupts for
pciehp. Just in case, did you capture /proc/interrupts after doing the power on?
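
One way to compare the counts is to sum the per-CPU columns of the pciehp rows
in /proc/interrupts before and after the power on. The sketch below assumes the
usual layout (a header row of CPU names, then one row per IRQ); the function
name count_irqs and the sample values are made up for illustration and are not
taken from the affected machine.

```python
def count_irqs(interrupts_text, needle="pciehp"):
    """Sum the per-CPU counts of every /proc/interrupts row naming `needle`."""
    lines = interrupts_text.splitlines()
    if not lines:
        return 0
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    total = 0
    for line in lines[1:]:
        if needle not in line:
            continue
        fields = line.split()
        # fields[0] is the IRQ label ("48:"); the next ncpus fields are counts.
        counts = fields[1:1 + ncpus]
        total += sum(int(c) for c in counts if c.isdigit())
    return total


# Hypothetical sample data, for illustration only.
SAMPLE = """\
           CPU0       CPU1
 48:          3          2   PCI-MSI-edge      pciehp
 49:          0          0   PCI-MSI-edge      pciehp
 50:       1200        900   PCI-MSI-edge      eth0
"""

print(count_irqs(SAMPLE))
```

On a live system the input would be the contents of /proc/interrupts, read once
before and once after the power on, so that the two totals can be subtracted.
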
Thanks,
Kenji Kaneshige
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/