linux-kernel - Re: [driver-core PATCH v8 2/9] driver core: Establish order of operations for device_add and device

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPcyv4jDG_2VwFr4oDGCURFrA=B0Bs2HGN9exLE47vDj0P3_8A@mail.gmail.com>
Date:   Mon, 10 Dec 2018 13:15:16 -0800
From:   Dan Williams <dan.j.williams@...el.com>
To:     alexander.h.duyck@...ux.intel.com
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Greg KH <gregkh@...uxfoundation.org>,
        "Luis R. Rodriguez" <mcgrof@...nel.org>,
        linux-nvdimm <linux-nvdimm@...ts.01.org>,
        Tejun Heo <tj@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux-pm mailing list <linux-pm@...r.kernel.org>,
        jiangshanlai@...il.com, "Rafael J. Wysocki" <rafael@...nel.org>,
        "Brown, Len" <len.brown@...el.com>, Pavel Machek <pavel@....cz>,
        zwisler@...nel.org, Dave Jiang <dave.jiang@...el.com>,
        bvanassche@....org
Subject: Re: [driver-core PATCH v8 2/9] driver core: Establish order of
 operations for device_add and device_del via bitflag

On Mon, Dec 10, 2018 at 12:58 PM Alexander Duyck
<alexander.h.duyck@...ux.intel.com> wrote:
>
> On Mon, 2018-12-10 at 11:43 -0800, Dan Williams wrote:
> > On Mon, Dec 10, 2018 at 11:35 AM Alexander Duyck
> > <alexander.h.duyck@...ux.intel.com> wrote:
> > >
> > > On Mon, 2018-12-10 at 10:58 -0800, Dan Williams wrote:
> > > > On Wed, Dec 5, 2018 at 9:25 AM Alexander Duyck
> > > > <alexander.h.duyck@...ux.intel.com> wrote:
> > > > >
> > > > > Add an additional bit flag to the device struct named "dead".
> > > > >
> > > > > This additional flag provides a guarantee that when a device_del is
> > > > > executed on a given interface an async worker will not attempt to attach
> > > > > the driver following the earlier device_del call. Previously this
> > > > > guarantee was not present and could result in the device_del call
> > > > > attempting to remove a driver from an interface only to have the async
> > > > > worker attempt to probe the driver later when it finally completes the
> > > > > asynchronous probe call.
> > > > >
> > > > > Signed-off-by: Alexander Duyck <alexander.h.duyck@...ux.intel.com>
> > > > > ---
> > > > >  drivers/base/core.c    |   11 +++++++++++
> > > > >  drivers/base/dd.c      |    8 ++++++--
> > > > >  include/linux/device.h |    5 +++++
> > > > >  3 files changed, 22 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > > > > index f3e6ca4170b4..70358327303b 100644
> > > > > --- a/drivers/base/core.c
> > > > > +++ b/drivers/base/core.c
> > > > > @@ -2075,6 +2075,17 @@ void device_del(struct device *dev)
> > > > >         struct kobject *glue_dir = NULL;
> > > > >         struct class_interface *class_intf;
> > > > >
> > > > > +       /*
> > > > > +        * Hold the device lock and set the "dead" flag to guarantee that
> > > > > +        * the update behavior is consistent with the other bitfields near
> > > > > +        * it and that we cannot have an asynchronous probe routine trying
> > > > > +        * to run while we are tearing out the bus/class/sysfs from
> > > > > +        * underneath the device.
> > > > > +        */
> > > > > +       device_lock(dev);
> > > > > +       dev->dead = true;
> > > > > +       device_unlock(dev);
> > > > > +
> > > > >         /* Notify clients of device removal.  This call must come
> > > > >          * before dpm_sysfs_remove().
> > > > >          */
> > > > > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > > > > index 88713f182086..3bb8c3e0f3da 100644
> > > > > --- a/drivers/base/dd.c
> > > > > +++ b/drivers/base/dd.c
> > > > > @@ -774,6 +774,10 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
> > > > >
> > > > >         device_lock(dev);
> > > > >
> > > > > +       /* device is or has been removed from the bus, just bail out */
> > > > > +       if (dev->dead)
> > > > > +               goto out_unlock;
> > > > > +
> > > >
> > > > What do you think about moving this check into
> > > > __device_attach_driver() alongside all the other checks? That way we
> > > > also get ->dead checking through the __device_attach() path.
> > >
> > > I'm not really sure that is the best spot to do that. Part of the
> > > reason being that by placing it where I did we avoid messing with the
> > > runtime power management for the parent if it was already powered off.
> >
> > ...but this already a rare event and the parent shouldn't otherwise be
> > bothered by a spurious pm_runtime wakeup event.
> >
> > > If anything I would say we could probably look at pulling the check out
> > > and placing the driver check in __device_attach_async_helper since from
> > > what I can tell the check is actually redundant in the non-async path
> > > anyway since __device_attach already had taken the device lock and
> > > checked dev->driver prior to calling __device_attach_driver.
> > >
> > > > ...and after that maybe it could be made a common helper
> > > > (dev_driver_checks()?) shared between __device_attach_driver() and
> > > > __driver_attach() to reduce some duplication.
> > >
> > > I'm not sure consolidating it into a function would really be worth the
> > > extra effort. It would essentially just obfuscate the checks and I am
> > > not sure you really save much with:
> > >         if (dev_driver_checks(dev))
> > > vs:
> > >         if (!dev->dead && !dev->driver)
> > >
> > > By the time you create the function and replace the few spots that are
> > > making these checks you would end up most likely adding more complexity
> > > to the kernel rather than reducing it any.
> >
> > No, I was talking about removing this duplication in
> > __device_attach_driver() and __driver_attach():
> >
> >         if (ret == 0) {
> >                 /* no match */
> >                 return 0;
> >         } else if (ret == -EPROBE_DEFER) {
> >                 dev_dbg(dev, "Device match requests probe deferral\n");
> >                 driver_deferred_probe_add(dev);
>
> Is this bit of code correct? Seems like there should be a return here
> doesn't it?

It does look odd, but I think it's ok as the driver is expected to
have its probe routine called multiple times and return -EPROBE_DEFER
if it's not ready yet.

> I just double checked and this is what is in the kernel too.

Yeah, I just copy-pasted it, but it might deserve a comment that the
fallthrough / no return is on purpose.

> >         } else if (ret < 0) {
> >                 dev_dbg(dev, "Bus failed to match device: %d", ret);
> >                 return ret;
> >         } /* ret > 0 means positive match */
> >
> > ...and lead in with a dev->dead check.
>
> I would think that we would want to check for dev->dead before we even
> call driver_match_device. That way we don't have the match function
> crawling around a device that is being disassembled. Is that what you
> were referring to?

I wasn't too concerned about optimizing the case where the probe path
loses the race with device_del().

> Also the context for the two functions seems to be a bit different. In
> the case of __device_attach_driver the device_lock is already held. In
> __driver_attach the lock on the device isn't taken until after a match
> has been found.

Yes, I was only pattern matching when looking at the context of where
dev->dead is checked in __driver_attach() and wondering why it was
checked outside of __device_attach_driver()