Message-ID: <CAJZ5v0gowV0WJd8pjwrDyHSJPvwgkCXYu9bDG7HHfcyzkSSY6w@mail.gmail.com>
Date:   Wed, 13 Dec 2023 14:07:59 +0100
From:   "Rafael J. Wysocki" <rafael@...nel.org>
To:     Igor Mammedov <imammedo@...hat.com>
Cc:     linux-kernel@...r.kernel.org,
        Dongli Zhang <dongli.zhang@...cle.com>,
        linux-acpi@...r.kernel.org, linux-pci@...r.kernel.org,
        mst@...hat.com, rafael@...nel.org, lenb@...nel.org,
        bhelgaas@...gle.com, mika.westerberg@...ux.intel.com,
        boris.ostrovsky@...cle.com, joe.jin@...cle.com,
        stable@...r.kernel.org, Fiona Ebner <f.ebner@...xmox.com>,
        Thomas Lamprecht <t.lamprecht@...xmox.com>
Subject: Re: [RFC 2/2] PCI: acpiphp: slowdown hotplug if hotplugging multiple
 devices at a time

On Wed, Dec 13, 2023 at 1:36 AM Igor Mammedov <imammedo@...hat.com> wrote:
>
> The previous commit ("PCI: acpiphp: enable slot only if it hasn't been enabled already")
> introduced a workaround to avoid a race between the SCSI_SCAN_ASYNC job and
> bridge reconfiguration in the case of a single HBA hotplug.
> However, in a virtualized environment it is possible to pause the machine,
> hotplug several HBAs, and then let the machine run again. That can hit the
> same race when the 2nd hotplugged HBA starts re-configuring the bridge.
> Do the same thing as SHPC and throttle hotplug of the 2nd and subsequent
> devices within a single hotplug event.
>
> Signed-off-by: Igor Mammedov <imammedo@...hat.com>
> ---
>  drivers/pci/hotplug/acpiphp_glue.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index 6b11609927d6..30bca2086b24 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -37,6 +37,7 @@
>  #include <linux/mutex.h>
>  #include <linux/slab.h>
>  #include <linux/acpi.h>
> +#include <linux/delay.h>
>
>  #include "../pci.h"
>  #include "acpiphp.h"
> @@ -700,6 +701,7 @@ static void trim_stale_devices(struct pci_dev *dev)
>  static void acpiphp_check_bridge(struct acpiphp_bridge *bridge)
>  {
>         struct acpiphp_slot *slot;
> +        int nr_hp_slots = 0;
>
>         /* Bail out if the bridge is going away. */
>         if (bridge->is_going_away)
> @@ -723,6 +725,10 @@ static void acpiphp_check_bridge(struct acpiphp_bridge *bridge)
>
>                         /* configure all functions */
>                         if (slot->flags != SLOT_ENABLED) {
> +                               if (nr_hp_slots)
> +                                       msleep(1000);

Why is 1000 considered the most suitable number here?  Any chance to
define a symbol for it?
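
For instance (just a sketch; ACPIPHP_SLOT_ENABLE_DELAY_MS is a made-up name,
not an existing symbol):

#define ACPIPHP_SLOT_ENABLE_DELAY_MS	1000

		/* give the previously enabled slot time to settle */
		if (nr_hp_slots)
			msleep(ACPIPHP_SLOT_ENABLE_DELAY_MS);

That would at least make the intent and the choice of interval explicit.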

And won't this affect the cases when the race in question is not a concern?

Also, adding arbitrary timeouts is not the most robust way of
addressing race conditions IMV.  Wouldn't it be better to add some
proper synchronization between the pieces of code that can race with
each other?
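
To illustrate what I mean (a rough, untested sketch, and only if the race
really is with the asynchronous SCSI scan triggered for the previously
enabled slot): instead of sleeping for a fixed interval, the code could
wait for outstanding device probes to finish before reconfiguring, e.g.

		/* let pending async probes settle before touching the bridge */
		if (nr_hp_slots)
			wait_for_device_probe();

wait_for_device_probe() can block for quite a while, though, so whether it
is acceptable in this path would need to be checked.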

> +
> +                                ++nr_hp_slots;
>                                 enable_slot(slot, true);
>                         }
>                 } else {
> --
