Message-ID: <06ed6ba2-00c4-ab38-4fcf-611133865615@redhat.com>
Date: Wed, 13 Apr 2022 10:02:35 -0400
From: Waiman Long <longman@...hat.com>
To: Dan Williams <dan.j.williams@...el.com>, linux-cxl@...r.kernel.org
Cc: Ira Weiny <ira.weiny@...el.com>, Dave Jiang <dave.jiang@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Vishal Verma <vishal.l.verma@...el.com>,
Ben Widawsky <ben.widawsky@...el.com>,
Kevin Tian <kevin.tian@...el.com>,
Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>,
Alison Schofield <alison.schofield@...el.com>,
Boqun Feng <boqun.feng@...il.com>,
Ingo Molnar <mingo@...hat.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Will Deacon <will@...nel.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
linux-kernel@...r.kernel.org, nvdimm@...ts.linux.dev
Subject: Re: [PATCH v2 00/12] device-core: Enable device_lock() lockdep
validation
On 4/13/22 02:01, Dan Williams wrote:
> Changes since v1 [1]:
> - Improve the clarity of the cover letter and changelogs of the
> major patches (Patch2 and Patch12) (Pierre, Kevin, and Dave)
> - Fix device_lock_interruptible() false negative deadlock detection
> (Kevin)
> - Fix off-by-one error in the device_set_lock_class() enable case (Kevin)
> - Spelling fixes in Patch2 changelog (Pierre)
> - Compilation fixes when both CONFIG_CXL_BUS=n and
> CONFIG_LIBNVDIMM=n. (0day robot)
>
> [1]: https://lore.kernel.org/all/164610292916.2682974.12924748003366352335.stgit@dwillia2-desk3.amr.corp.intel.com/
>
> ---
>
> The device_lock() is why the lockdep_set_novalidate_class() API exists.
> The lock is taken in too many disparate contexts, and lockdep by design
> assumes that all device_lock() acquisitions are identical. The lack of
> lockdep coverage leads to deadlock scenarios landing upstream. To
> mitigate that problem the lockdep_mutex was added [2].
>
> The lockdep_mutex lets a subsystem mirror device_lock() acquisitions
> without lockdep_set_novalidate_class() to gain some limited lockdep
> coverage. The mirroring approach is limited to taking the device_lock()
> after-the-fact in a subsystem's 'struct bus_type' operations and fails
> to cover device_lock() acquisition in the driver-core. It also can only
> track the needs of one subsystem at a time so, for example the kernel
> needs to be recompiled between CONFIG_PROVE_NVDIMM_LOCKING and
> CONFIG_PROVE_CXL_LOCKING depending on which subsystem is being
> regression tested. Obviously that also means that intra-subsystem
> locking dependencies can not be validated.
Instead of using a fake lockdep_mutex, maybe you can just use a unique
lockdep key for each subsystem and call lockdep_set_class() in
device_initialize() if such a key is present, or
lockdep_set_novalidate_class() otherwise. The unique key can be passed
either as a parameter to device_initialize() or as part of the device
structure. It is certainly less cumbersome than having a fake
lockdep_mutex array in the device structure.
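
As a rough illustration of that flow, here is a userspace mock that
compiles outside the kernel. It is a sketch under assumptions, not
driver-core code: the `lock_key` field, the `mock_*` helpers, and the two
key names are hypothetical and exist only to show the branch. The only
real kernel names being imitated are lockdep_set_class(),
lockdep_set_novalidate_class(), device_initialize(), and
struct lock_class_key.

```c
/*
 * Userspace mock, NOT kernel code. The types and helpers below only
 * imitate the shape of the lockdep API so the control flow can be
 * compiled and exercised outside the kernel.
 */
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct lock_class_key. */
struct lock_class_key { char dummy; };

/* Stand-in for a mutex that remembers which lockdep class it was given. */
struct mutex {
	const struct lock_class_key *key; /* class used for validation, if any */
	int novalidate;                   /* 1 when validation is skipped */
};

struct device {
	struct mutex mutex;
	const struct lock_class_key *lock_key; /* hypothetical per-subsystem key */
};

/* Mock of lockdep_set_class(): put the lock into a specific class. */
static void mock_lockdep_set_class(struct mutex *lock,
				   const struct lock_class_key *key)
{
	lock->key = key;
	lock->novalidate = 0;
}

/* Mock of lockdep_set_novalidate_class(): opt the lock out of validation. */
static void mock_lockdep_set_novalidate_class(struct mutex *lock)
{
	lock->key = NULL;
	lock->novalidate = 1;
}

/*
 * Mock of the suggested device_initialize() behavior: use the subsystem's
 * key when one was supplied, otherwise fall back to novalidate.
 */
static void mock_device_initialize(struct device *dev)
{
	assert(dev != NULL);
	if (dev->lock_key)
		mock_lockdep_set_class(&dev->mutex, dev->lock_key);
	else
		mock_lockdep_set_novalidate_class(&dev->mutex);
}

/* One key per subsystem (names are illustrative only). */
static struct lock_class_key cxl_lock_key;
static struct lock_class_key nvdimm_lock_key;
```

With this shape, a device whose subsystem sets `lock_key` before
initialization lands in its own class, and every other device keeps
today's novalidate behavior.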
Cheers,
Longman