linux-kernel - Re: [RFC 2/2] PCI: acpiphp: slowdown hotplug if hotplugging multiple devices at a time


Message-ID: <CAMLWh55dr2e_R+TYVj=8cFfV==D-DfOZvAeq9JEehYs3nw6-OQ@mail.gmail.com>
Date:   Wed, 13 Dec 2023 17:49:39 +0100
From:   Igor Mammedov <imammedo@...hat.com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     linux-kernel@...r.kernel.org,
        Dongli Zhang <dongli.zhang@...cle.com>,
        linux-acpi@...r.kernel.org, linux-pci@...r.kernel.org,
        mst@...hat.com, lenb@...nel.org, bhelgaas@...gle.com,
        mika.westerberg@...ux.intel.com, boris.ostrovsky@...cle.com,
        joe.jin@...cle.com, stable@...r.kernel.org,
        Fiona Ebner <f.ebner@...xmox.com>,
        Thomas Lamprecht <t.lamprecht@...xmox.com>
Subject: Re: [RFC 2/2] PCI: acpiphp: slowdown hotplug if hotplugging multiple
 devices at a time

On Wed, Dec 13, 2023 at 2:08 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
>
> On Wed, Dec 13, 2023 at 1:36 AM Igor Mammedov <imammedo@...hat.com> wrote:
> >
> > The previous commit ("PCI: acpiphp: enable slot only if it hasn't been enabled already")
> > introduced a workaround to avoid a race between the SCSI_SCAN_ASYNC job and
> > bridge reconfiguration in the case of a single HBA hotplug.
> > However, in a virt environment it's possible to pause the machine, hotplug
> > several HBAs and let the machine run. That can hit the same race when the 2nd
> > hotplugged HBA starts re-configuring the bridge.
> > Do the same thing as SHPC and throttle down hotplug of the 2nd and later
> > devices within a single hotplug event.
> >
> > Signed-off-by: Igor Mammedov <imammedo@...hat.com>
> > ---
> >  drivers/pci/hotplug/acpiphp_glue.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> > index 6b11609927d6..30bca2086b24 100644
> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> > @@ -37,6 +37,7 @@
> >  #include <linux/mutex.h>
> >  #include <linux/slab.h>
> >  #include <linux/acpi.h>
> > +#include <linux/delay.h>
> >
> >  #include "../pci.h"
> >  #include "acpiphp.h"
> > @@ -700,6 +701,7 @@ static void trim_stale_devices(struct pci_dev *dev)
> >  static void acpiphp_check_bridge(struct acpiphp_bridge *bridge)
> >  {
> >         struct acpiphp_slot *slot;
> > +        int nr_hp_slots = 0;
> >
> >         /* Bail out if the bridge is going away. */
> >         if (bridge->is_going_away)
> > @@ -723,6 +725,10 @@ static void acpiphp_check_bridge(struct acpiphp_bridge *bridge)
> >
> >                         /* configure all functions */
> >                         if (slot->flags != SLOT_ENABLED) {
> > +                               if (nr_hp_slots)
> > +                                       msleep(1000);
>
> Why is 1000 considered the most suitable number here?  Any chance to
> define a symbol for it?

The timeout was borrowed from the SHPC hotplug workflow, where it apparently
makes the race harder to reproduce
(though that's no excuse to add more timeouts elsewhere).
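
For illustration only, here is a minimal userspace sketch of the throttling
logic with the delay behind a named symbol. DEMO_HP_SETTLE_MS and both helper
names are hypothetical and not part of the posted patch, and the sketch
returns the delay instead of calling msleep() so it can be run outside the
kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical name; the posted patch hard-codes 1000 in msleep(). */
#define DEMO_HP_SETTLE_MS 1000U

/*
 * Delay to apply before enabling a slot, given how many slots were
 * already enabled earlier in the same hotplug event.
 */
static unsigned int demo_delay_before_enable(unsigned int nr_hp_slots)
{
	return nr_hp_slots ? DEMO_HP_SETTLE_MS : 0;
}

/*
 * Walk the slots of one event and return the total delay in ms that
 * the throttling would add. enabled[i] is true for slots that are
 * already enabled and are therefore skipped.
 */
static unsigned int demo_total_delay(const bool *enabled, size_t n)
{
	unsigned int nr_hp_slots = 0, total = 0;

	for (size_t i = 0; i < n; i++) {
		if (enabled[i])
			continue;
		total += demo_delay_before_enable(nr_hp_slots);
		++nr_hp_slots;
	}
	return total;
}
```

Under these assumptions a single hotplugged device pays no delay, and each
additional device in the same event adds one settle period.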

> And won't this affect the cases when the race in question is not a concern?

In practice it's unlikely, since even in a virt scenario the hypervisor won't
stop the VM to hotplug a device (that would defeat the whole purpose of hotplug).

But in the case of a very slow VM (overcommit case) it's possible for
several HBAs to be hotplugged by the time acpiphp gets a chance
to handle the 1st hotplug event. SHPC is more or less 'safe' with its
1 sec delay.

> Also, adding arbitrary timeouts is not the most robust way of
> addressing race conditions IMV.  Wouldn't it be better to add some
> proper synchronization between the pieces of code that can race with
> each other?

I don't like it either. It's a stopgap measure to hide the regression on
short notice, which I can fix up without much risk in the short time left
before folks leave for the holidays.
It's fine to drop the patch, as the chances of this happening are small.
[1/2] should cover the reported cases.

Since it's an RFC, I'm basically asking for opinions on a proper way to fix
SCSI_SCAN_ASYNC running wild while hotplug is in progress (and maybe SCSI
is not the only user that schedules an async job from device probe).
Adding synchronisation and testing it would take time (not something I'd do
this late in the cycle).

So far I'm thinking about adding a rw mutex to the bridge, with the PCI
hotplug subsystem being the writer, while SCSI scan jobs would be readers
and wait until the hotplug code says it's safe to proceed.
I plan to work in this direction and give it some testing, unless
someone has a better idea.

>
> > +
> > +                                ++nr_hp_slots;
> >                                 enable_slot(slot, true);
> >                         }
> >                 } else {
> > --
>