Message-ID: <20200622141820.7675d07d@why>
Date:   Mon, 22 Jun 2020 14:18:20 +0100
From:   Marc Zyngier <maz@...nel.org>
To:     linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Cc:     bhelgaas@...gle.com, kernel-team@...roid.com
Subject: Re: [PATCH] PCI/IOV: Plug VF bus creation race

On Sun,  7 Jun 2020 10:43:48 +0100
Marc Zyngier <maz@...nel.org> wrote:

Hi Bjorn,

> On a system that creates VFs for multiple PFs in parallel (in
> this case, network bringup at boot time), and when these VFs
> end up on the same bus, bad things sometimes happen:
> 
> [   12.755534] sysfs: cannot create duplicate filename '/devices/platform/soc/fc000000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/0000:02:01.0/pci_bus/0000:04'
> [   12.755700] pci 0000:04:10.1: [8086:10ca] type 00 class 0x020000
> [   12.763785] CPU: 1 PID: 581 Comm: vfs Tainted: G            E     5.7.0-00033-g002d24ebd695 #1119
> [   12.770402] igb 0000:03:00.1: 1 VFs allocated
> [   12.778493] Hardware name: amlogic w400/w400, BIOS 2020.01-rc5 03/12/2020
> [   12.778496] Call trace:
> [   12.778506]  dump_backtrace+0x0/0x1d0
> [   12.778511]  show_stack+0x20/0x30
> [   12.778516]  dump_stack+0xb8/0x100
> [   12.778520]  sysfs_warn_dup+0x6c/0x88
> [   12.778530]  sysfs_create_dir_ns+0xe8/0x100
> [   12.778535]  kobject_add_internal+0xe0/0x3a0
> [   12.778541]  kobject_add+0x94/0x100
> [   12.817654]  device_add+0x104/0x7b8
> [   12.821100]  device_register+0x28/0x38
> [   12.824810]  pci_add_new_bus+0x1f8/0x488
> [   12.828692]  pci_iov_add_virtfn+0x2c8/0x360
> [   12.832830]  sriov_enable+0x200/0x458
> [   12.836452]  pci_enable_sriov+0x20/0x38
> [   12.840282]  igb_enable_sriov+0x148/0x290 [igb]
> [   12.844745]  igb_pci_sriov_configure+0x40/0x80 [igb]
> [   12.849650]  sriov_numvfs_store+0xb0/0x1a0
> [   12.853703]  dev_attr_store+0x20/0x38
> [   12.857327]  sysfs_kf_write+0x4c/0x60
> [   12.860947]  kernfs_fop_write+0x104/0x220
> [   12.864916]  __vfs_write+0x24/0x50
> [   12.868279]  vfs_write+0xec/0x1d8
> [   12.871556]  ksys_write+0x74/0x100
> [   12.874919]  __arm64_sys_write+0x24/0x30
> [   12.878802]  el0_svc_common.constprop.0+0x7c/0x1f8
> [   12.883544]  do_el0_svc+0x2c/0x98
> [   12.886824]  el0_svc+0x18/0x48
> [   12.889841]  el0_sync_handler+0x120/0x290
> [   12.893808]  el0_sync+0x158/0x180
> [   12.897143] kobject_add_internal failed for 0000:04 with -EEXIST, don't try to register things with the same name in the same directory.
> [   12.897634] igbvf: Intel(R) Gigabit Virtual Function Network Driver - version 2.4.0-k
> 
> It turns out that virtfn_add_bus() doesn't hold any lock, which
> means there is a potential race between checking that the bus
> exists already, and adding it if it doesn't.
> 
> A per-device lock wouldn't help, as this happens when multiple
> PFs insert their respective VFs concurrently.
> 
> Instead, let's introduce a new mutex, private to the IOV subsystem,
> that gets taken when dealing with a virtfn bus (either creation
> or destruction). This ensures that these operations get serialized.
> 
> Cc: stable@...r.kernel.org
> Signed-off-by: Marc Zyngier <maz@...nel.org>
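
For anyone following along, the check-then-create race described in the
quoted commit message, and the effect of one shared lock, can be sketched
in userspace C. This is only an illustration under stated assumptions, not
the patch itself: bus_exists, buses_created, bus_lock, find_or_add_bus(),
pf_worker() and run_demo() are all hypothetical names, and a pthread mutex
stands in for the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_BUSES   256
#define MAX_THREADS 64

/* Hypothetical stand-ins for the bus list and the subsystem-wide mutex. */
static bool bus_exists[MAX_BUSES];
static int buses_created;
static pthread_mutex_t bus_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Look the bus up and create it if it is missing. Both steps happen
 * under the same lock, so two callers cannot both see "missing" and
 * then both create the bus.
 */
static void find_or_add_bus(int busnr)
{
	pthread_mutex_lock(&bus_lock);
	if (!bus_exists[busnr]) {
		bus_exists[busnr] = true;
		buses_created++;
	}
	pthread_mutex_unlock(&bus_lock);
}

/* Each thread plays the role of one PF adding a VF on the shared bus. */
static void *pf_worker(void *arg)
{
	find_or_add_bus(*(int *)arg);
	return NULL;
}

/* Race nthreads workers for the same bus; return the creation count. */
int run_demo(int nthreads)
{
	pthread_t tids[MAX_THREADS];
	int busnr = 4;	/* mirrors bus 0000:04 from the log, as an example */
	int i;

	assert(nthreads > 0 && nthreads <= MAX_THREADS);

	/* Reset state so the demo can be run more than once. */
	bus_exists[busnr] = false;
	buses_created = 0;

	for (i = 0; i < nthreads; i++) {
		int rc = pthread_create(&tids[i], NULL, pf_worker, &busnr);

		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);

	return buses_created;
}
```

With the lock held across both the check and the creation, run_demo()
returns 1 no matter how many workers race. A lock owned by each worker
would not give that guarantee, because the workers would never contend
on the same lock, which is the point made above about per-device locks.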

Did you have a chance to look into this? I can reliably trigger it on
one of my boxes. Happy to help debug it further if you think this
hack isn't the right fix.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
