[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5038E3B7.5090601@gmail.com>
Date: Sat, 25 Aug 2012 22:39:51 +0800
From: Jiang Liu <liuj97@...il.com>
To: Yijing Wang <wangyijing@...wei.com>
CC: Bjorn Helgaas <bhelgaas@...gle.com>,
PCI <linux-pci@...r.kernel.org>, linux-kernel@...r.kernel.org
Subject: Re: [RESEND BUGFIX PATCH 3/3] PCI: check whether pci device has been
removed when remove a pci device by sysfs
Hi Yijing,
The patch only patially fix the issue, there exists still small race
condition window because pdev->is_added isn't a reliable flag to depend on.
--Gerry
On 08/25/2012 05:59 PM, Yijing Wang wrote:
> We remove a pci device maybe like this
> echo 1 > /sys/bus/pci/devices/xxxx:xx:xx.x/remove
> Then remove_store function will be called to complete this remove work,
> later the remove work will be queued to sysfs_workqueue by device_schedule_callback.
> So if we remove a pci root port device and a pci endpoint device which was the root
> port's child device concurrently.The endponit device will be removed when root port's
> remove work completed,so when endpoint device itself's remove work start, since endpoint
> device has been removed, it will result to oops.
> This patch fix this.
>
> CallTrace:
> kworker/u:2[220]: Oops 11003706212352 [1]
> Modules linked in: cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi
> _cpufreq binfmt_misc fuse nls_iso8859_1 loop ipmi_si ipmi_devintf ipmi_msghandle
> r dm_mod igb ppdev iTCO_wdt parport_pc iTCO_vendor_support i2c_i801 parport sg m
> ptctl serio_raw i2c_core lpc_ich mfd_core hid_generic button container usbhid hi
> d uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif ext3 mbcache jbd fan pr
> ocessor ide_pci_generic ide_core ata_piix libata mptsas mptscsih mptbase scsi_tr
> ansport_sas scsi_mod thermal thermal_sys hwmon
>
> Pid: 220, CPU 30, comm: kworker/u:2
> psr : 0000121008526030 ifs : 8000000000000388 ip : [<a0000001004b3081>] Not
> tainted (3.5.0-rc6yijing-repo)
> ip is at __pci_remove_bus_device+0x101/0x1e0
> unat: 0000000000000000 pfs : 0000000000000388 rsc : 0000000000000003
> rnat: ffffffffffffffff bsps: ffffffffffffffff pr : 0000080001919585
> ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c9e70433f
> csd : 0000000000000000 ssd : 0000000000000000
> b0 : a0000001004b3060 b6 : a0000001004c2400 b7 : a0000001000faae0
> f6 : 000000000000000000000 f7 : 1003e00000000000057cd
> f8 : 1003e0000000050000003 f9 : 1003e000001cb8678a0d0
> f10 : 1003e9a05b7a39369e270 f11 : 1003e000000000000008f
> r1 : a0000001014e63c0 r2 : e000001f075dec00 r3 : 0000000000000000
> r8 : 0000000000000008 r9 : a0000001012e7308 r10 : 0000000004000000
> r11 : e000000f0006e800 r12 : e000001f08dbfe00 r13 : e000001f08db0000
> r14 : 0000000000000000 r15 : 0000000000000000 r16 : 0000000000000000
> r17 : e000000f0006f008 r18 : 000000000f000000 r19 : a0000001012f3910
> r20 : 0000000000100001 r21 : a000000101a62990 r22 : a000000100344580
> r23 : 0000000000000000 r24 : 0000000000001000 r25 : 0000000000000000
> r26 : a000000101a62988 r27 : e000003f0fc37e60 r28 : e000003f0fc37e68
> r29 : e000002f07012be0 r30 : 0000000082aa0260 r31 : 0000000000004000
>
> Call Trace:
> [<a000000100016500>] show_stack+0x80/0xa0
> sp=e000001f08dbf9c0 bsp=e000001f08db1388
> [<a000000100016b60>] show_regs+0x640/0x920
> sp=e000001f08dbfb90 bsp=e000001f08db1330
> [<a000000100040770>] die+0x190/0x2c0
> sp=e000001f08dbfba0 bsp=e000001f08db12f0
> [<a000000100908f60>] ia64_do_page_fault+0x7e0/0xac0
> sp=e000001f08dbfba0 bsp=e000001f08db1290
> [<a00000010000c0a0>] ia64_native_leave_kernel+0x0/0x270
> sp=e000001f08dbfc30 bsp=e000001f08db1290
> [<a0000001004b3080>] __pci_remove_bus_device+0x100/0x1e0
> sp=e000001f08dbfe00 bsp=e000001f08db1250
> [<a0000001004b32f0>] pci_stop_and_remove_bus_device+0x30/0x60
> sp=e000001f08dbfe00 bsp=e000001f08db1230
> [<a0000001004c2440>] remove_callback+0x40/0x80
> sp=e000001f08dbfe00 bsp=e000001f08db1208
> [<a0000001003445d0>] sysfs_schedule_callback_work+0x50/0x120
> sp=e000001f08dbfe00 bsp=e000001f08db11d0
> [<a0000001000bc2d0>] process_one_work+0x6f0/0xae0
> sp=e000001f08dbfe00 bsp=e000001f08db1158
> [<a0000001000bcf70>] worker_thread+0x3b0/0xc80
> sp=e000001f08dbfe00 bsp=e000001f08db1060
> [<a0000001000cf050>] kthread+0x110/0x140
> sp=e000001f08dbfe00 bsp=e000001f08db1028
> [<a000000100014590>] kernel_thread_helper+0x30/0x60
> sp=e000001f08dbfe30 bsp=e000001f08db1000
> [<a00000010000a0c0>] start_kernel_thread+0x20/0x40
> sp=e000001f08dbfe30 bsp=e000001f08db1000
> Disabling lock debugging due to kernel taint
> Unable to handle kernel NULL pointer dereference (address 0000000000000048)
> kworker/u:2[220]: Oops 11012296146944 [2]
>
> Pid: 220, CPU 30, comm: kworker/u:2
> psr : 0000121008022038 ifs : 8000000000000288 ip : [<a0000001000c4961>] Tain
> ted: G D (3.5.0-rc6yijing-repo)
> ip is at wq_worker_sleeping+0x61/0x200
> unat: 0000000000000000 pfs : 0000000000000288 rsc : 0000000000000003
> rnat: 0000121008026038 bsps: a0000001000407e0 pr : 965a684515516955
> ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
> csd : 0000000000000000 ssd : 0000000000000000
> b0 : a0000001000c4920 b6 : a0000001000f9fc0 b7 : a0000001000faae0
> f6 : 000000000000000000000 f7 : 1003e9e3779b97f4a7c16
> f8 : 1003e0000000050000003 f9 : 1003e000001cb87e8a5a8
> f10 : 1003e9a78b92717b9f0f8 f11 : 1003e000000000000008f
> r1 : a0000001014e63c0 r2 : 0000000000000000 r3 : fffffffffffc1200
> r8 : 0000000000000000 r9 : 000000000000001e r10 : a000000101432530
> r11 : a000000101432530 r12 : e000001f08dbfb70 r13 : e000001f08db0000
> r14 : 0000000000001000 r15 : a000000101432620 r16 : e000003000245d40
> r17 : fffffffffffc5c00 r18 : e000003000245d00 r19 : 00000000000000f8
> r20 : e000001f08db0070 r21 : 0000000000000048 r22 : e000003000245ce8
> r23 : e000003000245ce0 r24 : a000000101a638e0 r25 : ffffffffff48e500
> r26 : e000003f088a0098 r27 : 0000000000000400 r28 : 0000000000000001
> r29 : 000000000420806c r30 : e000001f08db0014 r31 : 0000000000000000
>
> Call Trace:
> [<a000000100016500>] show_stack+0x80/0xa0
> sp=e000001f08dbf730 bsp=e000001f08db16f8
> [<a000000100016b60>] show_regs+0x640/0x920
> sp=e000001f08dbf900 bsp=e000001f08db16a0
> [<a000000100040770>] die+0x190/0x2c0
> sp=e000001f08dbf910 bsp=e000001f08db1660
> [<a000000100908f60>] ia64_do_page_fault+0x7e0/0xac0
> sp=e000001f08dbf910 bsp=e000001f08db1600
> [<a00000010000c0a0>] ia64_native_leave_kernel+0x0/0x270
> sp=e000001f08dbf9a0 bsp=e000001f08db1600
> [<a0000001000c4960>] wq_worker_sleeping+0x60/0x200
> sp=e000001f08dbfb70 bsp=e000001f08db15b8
> [<a0000001009007e0>] __schedule+0x14c0/0x18c0
> sp=e000001f08dbfb70 bsp=e000001f08db1440
> [<a000000100900ea0>] schedule+0x60/0x140
> sp=e000001f08dbfb80 bsp=e000001f08db13e0
> [<a000000100090d10>] do_exit+0xef0/0x1740
> sp=e000001f08dbfb80 bsp=e000001f08db1330
> [<a000000100040840>] die+0x260/0x2c0
> sp=e000001f08dbfba0 bsp=e000001f08db12f0
> [<a000000100908f60>] ia64_do_page_fault+0x7e0/0xac0
> sp=e000001f08dbfba0 bsp=e000001f08db1290
> [<a00000010000c0a0>] ia64_native_leave_kernel+0x0/0x270
> sp=e000001f08dbfc30 bsp=e000001f08db1290
> [<a0000001004b3080>] __pci_remove_bus_device+0x100/0x1e0
> sp=e000001f08dbfe00 bsp=e000001f08db1250
> [<a0000001004b32f0>] pci_stop_and_remove_bus_device+0x30/0x60
> sp=e000001f08dbfe00 bsp=e000001f08db1230
> [<a0000001004c2440>] remove_callback+0x40/0x80
> sp=e000001f08dbfe00 bsp=e000001f08db1208
> [<a0000001003445d0>] sysfs_schedule_callback_work+0x50/0x120
> sp=e000001f08dbfe00 bsp=e000001f08db11d0
> [<a0000001000bc2d0>] process_one_work+0x6f0/0xae0
> sp=e000001f08dbfe00 bsp=e000001f08db1158
> [<a0000001000bcf70>] worker_thread+0x3b0/0xc80
> sp=e000001f08dbfe00 bsp=e000001f08db1060
> [<a0000001000cf050>] kthread+0x110/0x140
> sp=e000001f08dbfe00 bsp=e000001f08db1028
> [<a000000100014590>] kernel_thread_helper+0x30/0x60
> sp=e000001f08dbfe30 bsp=e000001f08db1000
> [<a00000010000a0c0>] start_kernel_thread+0x20/0x40
> sp=e000001f08dbfe30 bsp=e000001f08db1000
> Fixing recursive fault but reboot is needed!
> Modules linked in: cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi
> _cpufreq binfmt_misc fuse nls_iso8859_1 loop ipmi_si ipmi_devintf ipmi_msghandle
> r dm_mod igb ppdev iTCO_wdt parport_pc iTCO_vendor_support i2c_i801 parport sg m
> ptctl serio_raw i2c_core lpc_ich mfd_core hid_generic button container usbhid hi
> d uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif ext3 mbcache jbd fan pr
> ocessor ide_pci_generic ide_core ata_piix libata mptsas mptscsih mptbase scsi_tr
> ansport_sas scsi_mod thermal thermal_sys hwmon
>
> Signed-off-by: Yijing Wang <wangyijing@...wei.com>
> ---
> drivers/pci/pci-sysfs.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 6869009..b0be682 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -332,7 +332,10 @@ static void remove_callback(struct device *dev)
> struct pci_dev *pdev = to_pci_dev(dev);
>
> mutex_lock(&pci_remove_rescan_mutex);
> + if (!pdev->is_added)
> + goto out;
> pci_stop_and_remove_bus_device(pdev);
> +out:
> mutex_unlock(&pci_remove_rescan_mutex);
> }
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists