Message-ID: <CAE9FiQW07ZAU_0bjkrwK54TAn48ByyS=kXNFwQJ-G2FFkK_W3A@mail.gmail.com>
Date:	Mon, 18 Nov 2013 11:29:32 -0800
From:	Yinghai Lu <yinghai@...nel.org>
To:	Bjorn Helgaas <bhelgaas@...gle.com>
Cc:	Tejun Heo <tj@...nel.org>, Hugh Dickins <hughd@...gle.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Li Zefan <lizefan@...wei.com>,
	Markus Blank-Burian <burian@...nster.de>,
	Michal Hocko <mhocko@...e.cz>,
	Johannes Weiner <hannes@...xchg.org>,
	David Rientjes <rientjes@...gle.com>,
	Ying Han <yinghan@...gle.com>,
	Greg Thelen <gthelen@...gle.com>,
	Michel Lespinasse <walken@...gle.com>, cgroups@...r.kernel.org,
	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Alexander Duyck <alexander.h.duyck@...el.com>,
	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>
Subject: Re: Possible regression with cgroups in 3.11

On Mon, Nov 18, 2013 at 10:14 AM, Bjorn Helgaas <bhelgaas@...gle.com> wrote:
>> A bit of comment here would be nice but yeah I think this should work.
>> Can you please also queue the revert of c2fda509667b ("workqueue:
>> allow work_on_cpu() to be called recursively") after this patch?
>> Please feel free to add my acked-by.
>
> OK, below are the two patches (Alex's fix + the revert) I propose to
> merge.  Unless there are objections, I'll ask Linus to pull these
> before v3.13-rc1.
>
>
>
> commit 84f23f99b507c2c9247f47d3db0f71a3fd65e3a3
> Author: Alexander Duyck <alexander.h.duyck@...el.com>
> Date:   Mon Nov 18 10:59:59 2013 -0700
>
>     PCI: Avoid unnecessary CPU switch when calling driver .probe() method
>
>     If we are already on a CPU local to the device, call the driver .probe()
>     method directly without using work_on_cpu().
>
>     This is a workaround for a lockdep warning in the following scenario:
>
>       pci_call_probe
>         work_on_cpu(cpu, local_pci_probe, ...)
>           driver .probe
>             pci_enable_sriov
>               ...
>                 pci_bus_add_device
>                   ...
>                     pci_call_probe
>                       work_on_cpu(cpu, local_pci_probe, ...)
>
>     It would be better to fix PCI so we don't call VF driver .probe() methods
>     from inside a PF driver .probe() method, but that's a bigger project.
>
>     [bhelgaas: disable preemption, open bugzilla, rework comments & changelog]
>     Link: https://bugzilla.kernel.org/show_bug.cgi?id=65071
>     Link: http://lkml.kernel.org/r/CAE9FiQXYQEAZ=0sG6+2OdffBqfLS9MpoN1xviRR9aDbxPxcKxQ@mail.gmail.com
>     Link: http://lkml.kernel.org/r/20130624195942.40795.27292.stgit@ahduyck-cp1.jf.intel.com
>     Signed-off-by: Alexander Duyck <alexander.h.duyck@...el.com>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@...gle.com>
>     Acked-by: Tejun Heo <tj@...nel.org>

Tested-by: Yinghai Lu <yinghai@...nel.org>
Acked-by: Yinghai Lu <yinghai@...nel.org>

>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 9042fdbd7244..add04e70ac2a 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -288,12 +288,24 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>         int error, node;
>         struct drv_dev_and_id ddi = { drv, dev, id };
>
> -       /* Execute driver initialization on node where the device's
> -          bus is attached to.  This way the driver likely allocates
> -          its local memory on the right node without any need to
> -          change it. */
> +       /*
> +        * Execute driver initialization on node where the device is
> +        * attached.  This way the driver likely allocates its local memory
> +        * on the right node.
> +        */
>         node = dev_to_node(&dev->dev);
> -       if (node >= 0) {
> +       preempt_disable();
> +
> +       /*
> +        * On NUMA systems, we are likely to call a PF probe function using
> +        * work_on_cpu().  If that probe calls pci_enable_sriov() (which
> +        * adds the VF devices via pci_bus_add_device()), we may re-enter
> +        * this function to call the VF probe function.  Calling
> +        * work_on_cpu() again will cause a lockdep warning.  Since VFs are
> +        * always on the same node as the PF, we can work around this by
> +        * avoiding work_on_cpu() when we're already on the correct node.
> +        */
> +       if (node >= 0 && node != numa_node_id()) {
>                 int cpu;
>
>                 get_online_cpus();
> @@ -305,6 +317,8 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>                 put_online_cpus();
>         } else
>                 error = local_pci_probe(&ddi);
> +
> +       preempt_enable();
>         return error;
>  }
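
For readers following the hunk above, here is a minimal userspace sketch of the
decision the patched condition makes. It is not the kernel code: the names
needs_cpu_switch(), needs_cpu_switch_old() and NO_NODE are hypothetical, and
the two int parameters stand in for dev_to_node(&dev->dev) and numa_node_id().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for "device has no known NUMA node" (negative). */
#define NO_NODE (-1)

/*
 * Model of the patched test in pci_call_probe():
 *     if (node >= 0 && node != numa_node_id())
 * dev_node plays the role of dev_to_node(&dev->dev) and current_node
 * plays the role of numa_node_id().  Returns true when the probe would
 * be handed to work_on_cpu(), false when it would run directly.
 */
static bool needs_cpu_switch(int dev_node, int current_node)
{
	assert(current_node >= 0);	/* the running CPU always has a node */
	return dev_node >= 0 && dev_node != current_node;
}

/* Model of the pre-patch test, for comparison: if (node >= 0) */
static bool needs_cpu_switch_old(int dev_node)
{
	return dev_node >= 0;
}
```

In the nested case from the changelog, the PF probe is already running on the
device's node, so a VF on that same node gives needs_cpu_switch(n, n) == false
and the VF probe is called directly. The old test would have returned true and
re-entered work_on_cpu().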