[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPcyv4hwsKCbaxDhAL6LPZQmLZZV2T4wpja+cNZpjy=enR-eYQ@mail.gmail.com>
Date: Thu, 14 Apr 2022 12:43:31 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-cxl@...r.kernel.org,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Dave Jiang <dave.jiang@...el.com>,
Kevin Tian <kevin.tian@...el.com>,
Vishal L Verma <vishal.l.verma@...el.com>,
"Schofield, Alison" <alison.schofield@...el.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux NVDIMM <nvdimm@...ts.linux.dev>
Subject: Re: [PATCH v2 02/12] device-core: Add dev->lock_class to enable
device_lock() lockdep validation
On Thu, Apr 14, 2022 at 12:34 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Thu, Apr 14, 2022 at 10:17:13AM -0700, Dan Williams wrote:
>
> > One more sanity check... So in driver subsystems there are cases where
> > a device on busA hosts a topology on busB. When that happens there's a
> > need to set the lock class late in a driver since busA knows nothing
> > about the locking rules of busB.
>
> I'll pretend I konw what you're talking about ;-)
>
> > Since the device has a longer lifetime than a driver when the driver
> > exits it must set dev->mutex back to the novalidate class, otherwise
> > it sets up a use after free of the static lock_class_key.
>
> I'm not following, static storage has infinite lifetime.
Not static storage in a driver module.
modprobe -r fancy_lockdep_using_driver.ko
Any use of device_lock() by the core on a device that a driver in this
module was driving will de-reference a now invalid pointer into
whatever memory was vmalloc'd for the module static data.
>
> > I came up with this and it seems to work, just want to make sure I'm
> > properly using the lock_set_class() API and it is ok to transition
> > back and forth from the novalidate case:
> >
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 990b6670222e..32673e1a736d 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -405,6 +405,29 @@ struct cxl_nvdimm_bridge
> > *cxl_find_nvdimm_bridge(struct cxl_nvdimm *cxl_nvd);
> > #define __mock static
> > #endif
> >
> > +#ifdef CONFIG_PROVE_LOCKING
> > +static inline void cxl_lock_reset_class(void *_dev)
> > +{
> > + struct device *dev = _dev;
> > +
> > + lock_set_class(&dev->mutex.dep_map, "__lockdep_no_validate__",
> > + &__lockdep_no_validate__, 0, _THIS_IP_);
> > +}
> > +
> > +static inline int cxl_lock_set_class(struct device *dev, const char *name,
> > + struct lock_class_key *key)
> > +{
> > + lock_set_class(&dev->mutex.dep_map, name, key, 0, _THIS_IP_);
> > + return devm_add_action_or_reset(dev, cxl_lock_reset_class, dev);
> > +}
> > +#else
> > +static inline int cxl_lock_set_class(struct device *dev, const char *name,
> > + struct lock_class_key *key)
> > +{
> > + return 0;
> > +}
> > +#endif
>
> Under the assumption that the lock is held (lock_set_class() will
> actually barf if @lock isn't held) this should indeed work as expected
Nice.
> (although I think you got the @name part 'wrong', I think that's
> canonically something like "&dev->mutex" or something).
Ah, yes, I got it wrong for restoring the same name that results from
lockdep_set_novalidate_class(), will fix.
Powered by blists - more mailing lists