Message-ID: <516AA80F.7040505@mellanox.com>
Date:	Sun, 14 Apr 2013 15:58:55 +0300
From:	Or Gerlitz <ogerlitz@...lanox.com>
To:	Tejun Heo <tj@...nel.org>
CC:	"Michael S. Tsirkin" <mst@...hat.com>,
	Ming Lei <ming.lei@...onical.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	David Miller <davem@...emloft.net>,
	Roland Dreier <roland@...nel.org>,
	netdev <netdev@...r.kernel.org>, Yan Burman <yanb@...lanox.com>,
	Jack Morgenstein <jackm@....mellanox.co.il>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	<linux-pci@...r.kernel.org>
Subject: Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV
 probes

On 11/04/2013 23:41, Tejun Heo wrote:
> Hello,
>
> On Thu, Apr 11, 2013 at 11:30:53PM +0300, Michael S. Tsirkin wrote:
>> Okay, so you are saying it's a false-positive?
> Yeah, I think so.  It didn't actually lock up, right?  If it did,
> our analysis up to this point is likely to be completely wrong.
>
>> Want to send a patch so Or can try it out?
> Hmmm... something like the following on the workqueue side (completely untested).
>
> diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
> index 8afab27..899d470 100644
> --- a/include/linux/workqueue.h
> +++ b/include/linux/workqueue.h
> @@ -466,14 +466,21 @@ static inline bool __deprecated flush_delayed_work_sync(struct delayed_work *dwo
>   }
>   
>   #ifndef CONFIG_SMP
> -static inline long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
> +static inline long work_on_cpu_nested(unsigned int cpu, long (*fn)(void *),
> +				      void *arg, int subclass)
>   {
>   	return fn(arg);
>   }
>   #else
> -long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg);
> +long work_on_cpu_nested(unsigned int cpu, long (*fn)(void *), void *arg,
> +			int subclass);
>   #endif /* CONFIG_SMP */
>   
> +static inline long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
> +{
> +	return work_on_cpu_nested(cpu, fn, arg, 0);
> +}
> +
>   #ifdef CONFIG_FREEZER
>   extern void freeze_workqueues_begin(void);
>   extern bool freeze_workqueues_busy(void);
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 81f2457..c2be670 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3555,25 +3555,30 @@ static void work_for_cpu_fn(struct work_struct *work)
>   }
>   
>   /**
> - * work_on_cpu - run a function in user context on a particular cpu
> + * work_on_cpu_nested - run a function in user context on a particular cpu
>    * @cpu: the cpu to run on
>    * @fn: the function to run
>    * @arg: the function arg
> + * @subclass: lockdep subclass
>    *
>    * This will return the value @fn returns.
>    * It is up to the caller to ensure that the cpu doesn't go offline.
>    * The caller must not hold any locks which would prevent @fn from completing.
> + *
> + * XXX: explain @subclass
>    */
> -long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
> +long work_on_cpu_nested(unsigned int cpu, long (*fn)(void *), void *arg,
> +			int subclass)
>   {
>   	struct work_for_cpu wfc = { .fn = fn, .arg = arg };
>   
>   	INIT_WORK_ONSTACK(&wfc.work, work_for_cpu_fn);
> +	lock_set_subclass(&wfc.work.lockdep_map, subclass, _RET_IP_);
>   	schedule_work_on(cpu, &wfc.work);
>   	flush_work(&wfc.work);
>   	return wfc.ret;
>   }
> -EXPORT_SYMBOL_GPL(work_on_cpu);
> +EXPORT_SYMBOL_GPL(work_on_cpu_nested);
>   #endif /* CONFIG_SMP */
>   
>   #ifdef CONFIG_FREEZER

Hi,

So the patch eliminated the lockdep warning for the mlx4 nested probing
sequence, but introduced a lockdep warning for
00:13.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub I/OxAPIC
Interrupt Controller (rev 22)

See below for the lockdep output and the lspci listing; attached are the
full boot sequence dmesg and my .config. I was running against the net
git tree as of commit
2e0cbf2cc2c9371f0aa198857d799175ffe231a6
"net: mvmdio: add select PHYLIB"

From quick tests, the system is as operative with the patch as it was
without it, e.g. the mlx4 VFs probed on the host are working OK, and so
is the 1G Intel NIC.
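
For readers following the thread, here is a toy userspace model of the general idea behind a lockdep subclass, which is presumably why the nested mlx4 probe no longer triggers the report. It is only a sketch: `struct task_locks`, `struct held` and `model_acquire()` are hypothetical names invented for illustration, and the model assumes that a second acquisition is reported only when both the class and the subclass match an entry that is already held. It is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_DEPTH 8

/* Hypothetical stand-in for one entry of a task's held-lock stack. */
struct held {
	const char *class_name;
	int subclass;
};

struct task_locks {
	struct held stack[MAX_DEPTH];
	int depth;
};

/*
 * Record an acquisition of (class_name, subclass).
 * Returns false, standing in for a "possible recursive locking" report,
 * when an entry with the same class AND the same subclass is already held.
 */
static bool model_acquire(struct task_locks *t, const char *class_name,
			  int subclass)
{
	assert(t->depth < MAX_DEPTH);

	for (int i = 0; i < t->depth; i++) {
		if (strcmp(t->stack[i].class_name, class_name) == 0 &&
		    t->stack[i].subclass == subclass)
			return false;
	}

	t->stack[t->depth].class_name = class_name;
	t->stack[t->depth].subclass = subclass;
	t->depth++;
	return true;
}
```

In this model, acquiring "(&wfc.work)" twice with subclass 0 is reported, while an outer acquisition with subclass 0 followed by a nested one with subclass 1 is accepted.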


We have a holiday here Mon/Tue, so I will be able to test further patches on Wed.

Or.




=====================================
[ BUG: bad unlock balance detected! ]
3.9.0-rc6+ #53 Not tainted
-------------------------------------
swapper/0/1 is trying to release lock ((&wfc.work)) at:
[<ffffffff8122014c>] pci_device_probe+0xfc/0x120
but there are no more locks to release!

other info that might help us debug this:
2 locks held by swapper/0/1:
  #0:  (&__lockdep_no_validate__){......}, at: [<ffffffff812da443>] __driver_attach+0x53/0xb0
  #1:  (&__lockdep_no_validate__){......}, at: [<ffffffff812da451>] __driver_attach+0x61/0xb0

stack backtrace:
Pid: 1, comm: swapper/0 Not tainted 3.9.0-rc6+ #53
Call Trace:
[<ffffffff8122014c>] ? pci_device_probe+0xfc/0x120
  [<ffffffff81093529>] print_unlock_imbalance_bug+0xf9/0x100
  [<ffffffff8109616f>] lock_set_class+0x27f/0x7c0
  [<ffffffff81091d9e>] ? mark_held_locks+0x9e/0x130
  [<ffffffff8122014c>] ? pci_device_probe+0xfc/0x120
  [<ffffffff81066aeb>] work_on_cpu_nested+0x8b/0xc0
  [<ffffffff810633c0>] ? keventd_up+0x20/0x20
  [<ffffffff8121f420>] ? pci_pm_prepare+0x60/0x60
  [<ffffffff8122014c>] pci_device_probe+0xfc/0x120
  [<ffffffff812da0fa>] ? driver_sysfs_add+0x7a/0xb0
  [<ffffffff812da24f>] driver_probe_device+0x8f/0x230
  [<ffffffff812da493>] __driver_attach+0xa3/0xb0
  [<ffffffff812da3f0>] ? driver_probe_device+0x230/0x230
  [<ffffffff812da3f0>] ? driver_probe_device+0x230/0x230
  [<ffffffff812d86fc>] bus_for_each_dev+0x8c/0xb0
  [<ffffffff812da079>] driver_attach+0x19/0x20
  [<ffffffff812d91a0>] bus_add_driver+0x1f0/0x250
  [<ffffffff818bd596>] ? dmi_pcie_pme_disable_msi+0x21/0x21
  [<ffffffff812daadf>] driver_register+0x6f/0x150
  [<ffffffff818bd596>] ? dmi_pcie_pme_disable_msi+0x21/0x21
  [<ffffffff8122026f>] __pci_register_driver+0x5f/0x70
  [<ffffffff818bd5ff>] pcie_portdrv_init+0x69/0x7a
  [<ffffffff810001fd>] do_one_initcall+0x3d/0x170
  [<ffffffff81895943>] kernel_init_freeable+0x10d/0x19c
  [<ffffffff818959d2>] ? kernel_init_freeable+0x19c/0x19c
  [<ffffffff8145a040>] ? rest_init+0x160/0x160
  [<ffffffff8145a049>] kernel_init+0x9/0xf0
  [<ffffffff8146ca6c>] ret_from_fork+0x7c/0xb0
  [<ffffffff8145a040>] ? rest_init+0x160/0x160
ioapic: probe of 0000:00:13.0 failed with error -22
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
intel_idle: MWAIT substates: 0x1120
intel_idle: v0.4 model 0x2C
intel_idle: lapic_timer_reliable_states 0xffffffff
ACPI: Requesting acpi_cpufreq
ERST: Failed to get Error Log Address Range.
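
The trace above shows print_unlock_imbalance_bug() being reached from lock_set_class(), called from work_on_cpu_nested(), while the task holds two locks, neither of which is (&wfc.work). One plausible reading is that the subclass is being changed on a lock the caller does not currently hold. The sketch below models that reading in userspace C: `struct held_stack`, `model_hold()` and `model_set_subclass()` are hypothetical names, and the assumption that the lookup walks the task's held-lock stack from the top is a simplification, not the kernel code.

```c
#include <assert.h>
#include <string.h>

#define MAX_HELD 8

/* Hypothetical stand-in for the per-task stack of held locks. */
struct held_stack {
	const char *names[MAX_HELD];
	int depth;
};

/* Record that the task now holds the named lock. */
static void model_hold(struct held_stack *s, const char *name)
{
	assert(s->depth < MAX_HELD);
	s->names[s->depth++] = name;
}

/*
 * Model of changing the subclass of a lock: the lock must already be
 * on the held stack. Returns 0 when it is found, or -1 when it is not,
 * which stands in for the "bad unlock balance" report.
 */
static int model_set_subclass(const struct held_stack *s, const char *name)
{
	for (int i = s->depth - 1; i >= 0; i--) {
		if (strcmp(s->names[i], name) == 0)
			return 0;
	}
	return -1;
}
```

With two unrelated locks on the stack, as in the report, looking up "(&wfc.work)" fails in this model; it succeeds only once that lock has been recorded as held.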

# lspci
00:00.0 Host bridge: Intel Corporation 5520 I/O Hub to ESI Port (rev 22)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express 
Root Port 1 (rev 22)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express 
Root Port 3 (rev 22)
00:05.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root 
Port 5 (rev 22)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express 
Root Port 7 (rev 22)
00:09.0 PCI bridge: Intel Corporation 7500/5520/5500/X58 I/O Hub PCI 
Express Root Port 9 (rev 22)
00:13.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub I/OxAPIC 
Interrupt Controller (rev 22)
00:14.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub System 
Management Registers (rev 22)
00:14.1 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub GPIO and 
Scratch Pad Registers (rev 22)
00:14.2 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub Control Status 
and RAS Registers (rev 22)
00:14.3 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub Throttle 
Registers (rev 22)
00:16.0 System peripheral: Intel Corporation 5520/5500/X58 Chipset 
QuickData Technology Device (rev 22)
00:16.1 System peripheral: Intel Corporation 5520/5500/X58 Chipset 
QuickData Technology Device (rev 22)
00:16.2 System peripheral: Intel Corporation 5520/5500/X58 Chipset 
QuickData Technology Device (rev 22)
00:16.3 System peripheral: Intel Corporation 5520/5500/X58 Chipset 
QuickData Technology Device (rev 22)
00:16.4 System peripheral: Intel Corporation 5520/5500/X58 Chipset 
QuickData Technology Device (rev 22)
00:16.5 System peripheral: Intel Corporation 5520/5500/X58 Chipset 
QuickData Technology Device (rev 22)
00:16.6 System peripheral: Intel Corporation 5520/5500/X58 Chipset 
QuickData Technology Device (rev 22)
00:16.7 System peripheral: Intel Corporation 5520/5500/X58 Chipset 
QuickData Technology Device (rev 22)
00:1a.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB 
UHCI Controller #4
00:1a.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB 
UHCI Controller #5
00:1a.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB 
UHCI Controller #6
00:1a.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 
EHCI Controller #2
00:1d.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB 
UHCI Controller #1
00:1d.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB 
UHCI Controller #2
00:1d.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB 
UHCI Controller #3
00:1d.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 
EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface 
Controller
00:1f.2 IDE interface: Intel Corporation 82801JI (ICH10 Family) 4 port 
SATA IDE Controller #1
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
00:1f.5 IDE interface: Intel Corporation 82801JI (ICH10 Family) 2 port 
SATA IDE Controller #2
01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network 
Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network 
Connection (rev 01)
04:00.0 Network controller: Mellanox Technologies MT27500 Family 
[ConnectX-3]
04:00.1 Network controller: Mellanox Technologies MT27500 Family 
[ConnectX-3 Virtual Function]
04:00.2 Network controller: Mellanox Technologies MT27500 Family 
[ConnectX-3 Virtual Function]
04:00.3 Network controller: Mellanox Technologies MT27500 Family 
[ConnectX-3 Virtual Function]
05:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
05:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
05:10.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller 
Virtual Function (rev 01)
05:10.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller 
Virtual Function (rev 01)
07:01.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA 
G200eW WPCM450 (rev 0a)
fe:00.0 Host bridge: Intel Corporation Xeon 5600 Series QuickPath 
Architecture Generic Non-core Registers (rev 02)
fe:00.1 Host bridge: Intel Corporation Xeon 5600 Series QuickPath 
Architecture System Address Decoder (rev 02)
fe:02.0 Host bridge: Intel Corporation Xeon 5600 Series QPI Link 0 (rev 02)
fe:02.1 Host bridge: Intel Corporation Xeon 5600 Series QPI Physical 0 
(rev 02)
fe:02.2 Host bridge: Intel Corporation Xeon 5600 Series Mirror Port Link 
0 (rev 02)
fe:02.3 Host bridge: Intel Corporation Xeon 5600 Series Mirror Port Link 
1 (rev 02)
fe:02.4 Host bridge: Intel Corporation Xeon 5600 Series QPI Link 1 (rev 02)
fe:02.5 Host bridge: Intel Corporation Xeon 5600 Series QPI Physical 1 
(rev 02)
fe:03.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Registers (rev 02)
fe:03.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Target Address Decoder (rev 02)
fe:03.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller RAS Registers (rev 02)
fe:03.4 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Test Registers (rev 02)
fe:04.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 0 Control (rev 02)
fe:04.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 0 Address (rev 02)
fe:04.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 0 Rank (rev 02)
fe:04.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 0 Thermal Control (rev 02)
fe:05.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 1 Control (rev 02)
fe:05.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 1 Address (rev 02)
fe:05.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 1 Rank (rev 02)
fe:05.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 1 Thermal Control (rev 02)
fe:06.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 2 Control (rev 02)
fe:06.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 2 Address (rev 02)
fe:06.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 2 Rank (rev 02)
fe:06.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 2 Thermal Control (rev 02)
ff:00.0 Host bridge: Intel Corporation Xeon 5600 Series QuickPath 
Architecture Generic Non-core Registers (rev 02)
ff:00.1 Host bridge: Intel Corporation Xeon 5600 Series QuickPath 
Architecture System Address Decoder (rev 02)
ff:02.0 Host bridge: Intel Corporation Xeon 5600 Series QPI Link 0 (rev 02)
ff:02.1 Host bridge: Intel Corporation Xeon 5600 Series QPI Physical 0 
(rev 02)
ff:02.2 Host bridge: Intel Corporation Xeon 5600 Series Mirror Port Link 
0 (rev 02)
ff:02.3 Host bridge: Intel Corporation Xeon 5600 Series Mirror Port Link 
1 (rev 02)
ff:02.4 Host bridge: Intel Corporation Xeon 5600 Series QPI Link 1 (rev 02)
ff:02.5 Host bridge: Intel Corporation Xeon 5600 Series QPI Physical 1 
(rev 02)
ff:03.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Registers (rev 02)
ff:03.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Target Address Decoder (rev 02)
ff:03.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller RAS Registers (rev 02)
ff:03.4 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Test Registers (rev 02)
ff:04.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 0 Control (rev 02)
ff:04.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 0 Address (rev 02)
ff:04.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 0 Rank (rev 02)
ff:04.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 0 Thermal Control (rev 02)
ff:05.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 1 Control (rev 02)
ff:05.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 1 Address (rev 02)
ff:05.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 1 Rank (rev 02)
ff:05.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 1 Thermal Control (rev 02)
ff:06.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 2 Control (rev 02)
ff:06.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 2 Address (rev 02)
ff:06.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 2 Rank (rev 02)
ff:06.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated 
Memory Controller Channel 2 Thermal Control (rev 02)



View attachment "dmesg-net-2e0cbf2-tejun-patched" of type "text/plain" (71083 bytes)

View attachment "config-net-2e0cbf2" of type "text/plain" (77585 bytes)
