Message-ID: <597B109EC20B76429F71A8A97770610D12A63C81@ALA-MBD.corp.ad.wrs.com>
Date: Mon, 18 Mar 2019 01:41:10 +0000
From: "Liu, Yongxin" <Yongxin.Liu@...driver.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-rt-users@...r.kernel.org" <linux-rt-users@...r.kernel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"rostedt@...dmis.org" <rostedt@...dmis.org>,
"dan.j.williams@...el.com" <dan.j.williams@...el.com>,
"pagupta@...hat.com" <pagupta@...hat.com>,
"Gortmaker, Paul" <Paul.Gortmaker@...driver.com>,
"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>
Subject: RE: [PATCH RT] nvdimm: make lane acquirement RT aware
> -----Original Message-----
> From: linux-kernel-owner@...r.kernel.org [mailto:linux-kernel-
> owner@...r.kernel.org] On Behalf Of Sebastian Andrzej Siewior
> Sent: Saturday, March 16, 2019 00:43
> To: Liu, Yongxin
> Cc: linux-kernel@...r.kernel.org; linux-rt-users@...r.kernel.org;
> tglx@...utronix.de; rostedt@...dmis.org; dan.j.williams@...el.com;
> pagupta@...hat.com; Gortmaker, Paul; linux-nvdimm@...ts.01.org
> Subject: Re: [PATCH RT] nvdimm: make lane acquirement RT aware
>
> On 2019-03-11 00:44:58 [+0000], Liu, Yongxin wrote:
> > > but you still have the ndl_lock->lock which protects the resource. So in
> > > the unlikely (but possible) event that you switch CPUs after obtaining
> > > the CPU number you block on the lock. No harm is done, right?
> >
> > The resource "lane" can be acquired recursively, so "ndl_lock->lock" is
> > a conditional lock.
> >
> > ndl_count->count is per CPU.
> > ndl_lock->lock is per lane.
> >
> > Here is an example:
> > Thread A on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> get "ndl_lock->lock"
> >   --> nd_region_acquire_lane --> lane# 5 --> bypass "ndl_lock->lock" due to "ndl_count->count++".
> >
> > Thread B on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> bypass "ndl_lock->lock"
> >   ("ndl_count->count" was changed by Thread A)
> >
> > If we use raw_smp_processor_id(), no matter which CPU the thread was
> > migrated to, if there is another thread running on the old CPU, there
> > will be a race condition due to the per-CPU variable "ndl_count->count".
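To spell out that interleaving (my sketch: the existing count/lock pair with
raw_smp_processor_id() swapped in for get_cpu(), using the driver's field names):

    cpu = raw_smp_processor_id();                   /* no preempt_disable() */
    lane = cpu % nd_region->num_lanes;
    ndl_count = per_cpu_ptr(nd_region->lane, cpu);  /* per-CPU counter */
    ndl_lock = per_cpu_ptr(nd_region->lane, lane);  /* per-lane lock */
    if (ndl_count->count++ == 0)    /* Thread A on CPU 5 sees 0 and takes the
                                     * lane-5 lock, then migrates to CPU 6.
                                     * Thread B on CPU 5 sees count == 1,
                                     * skips the lock and races with A on
                                     * lane 5. */
        spin_lock(&ndl_lock->lock);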
>
> so I've been looking at it again. The recursive locking could have been
> solved better. Like the local_lock() on -RT is doing it.
> Given that you lock with preempt_disable() there should be no in-IRQ
> usage.
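(For reference, a minimal sketch of the local_lock() pattern Sebastian refers
to, assuming the locallock.h API from the -RT patch queue; my illustration,
not part of his mail. On -RT, local_lock() already records an owner and a
nesting count, which is essentially what the patch below open-codes:)

    #include <linux/locallock.h>

    static DEFINE_LOCAL_IRQ_LOCK(lane_lock);  /* per-CPU; owner + nest count on -RT */

    static void lane_critical_section(void)
    {
        local_lock(lane_lock);   /* the same task may re-acquire it safely */
        /* ... touch CPU-local lane state ... */
        local_unlock(lane_lock);
    }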
> But in the "nd_region->num_lanes >= nr_cpu_ids" case you don't take any
> locks. That would be a problem with the raw_smp_processor_id() approach.
>
> So what about the completely untested patch here:
>
> diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
> index 379bf4305e615..98c2e9df4b2e4 100644
> --- a/drivers/nvdimm/nd.h
> +++ b/drivers/nvdimm/nd.h
> @@ -109,7 +109,8 @@ unsigned sizeof_namespace_label(struct nvdimm_drvdata *ndd);
>  			res; res = next, next = next ? next->sibling : NULL)
>
>  struct nd_percpu_lane {
> -	int count;
> +	struct task_struct *owner;
> +	int nestcnt;
>  	spinlock_t lock;
>  };
>
> diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> index e2818f94f2928..8a62f9833513f 100644
> --- a/drivers/nvdimm/region_devs.c
> +++ b/drivers/nvdimm/region_devs.c
> @@ -946,19 +946,17 @@ int nd_blk_region_init(struct nd_region *nd_region)
>   */
>  unsigned int nd_region_acquire_lane(struct nd_region *nd_region)
>  {
> +	struct nd_percpu_lane *ndl_lock;
>  	unsigned int cpu, lane;
>
> -	cpu = get_cpu();
> -	if (nd_region->num_lanes < nr_cpu_ids) {
> -		struct nd_percpu_lane *ndl_lock, *ndl_count;
> -
> -		lane = cpu % nd_region->num_lanes;
> -		ndl_count = per_cpu_ptr(nd_region->lane, cpu);
> -		ndl_lock = per_cpu_ptr(nd_region->lane, lane);
> -		if (ndl_count->count++ == 0)
> -			spin_lock(&ndl_lock->lock);
> -	} else
> -		lane = cpu;
> +	cpu = raw_smp_processor_id();
> +	lane = cpu % nd_region->num_lanes;
> +	ndl_lock = per_cpu_ptr(nd_region->lane, lane);
> +	if (ndl_lock->owner != current) {
> +		spin_lock(&ndl_lock->lock);
> +		ndl_lock->owner = current;
> +	}
> +	ndl_lock->nestcnt++;
>
>  	return lane;
>  }
> @@ -966,17 +964,16 @@ EXPORT_SYMBOL(nd_region_acquire_lane);
>
>  void nd_region_release_lane(struct nd_region *nd_region, unsigned int lane)
>  {
> -	if (nd_region->num_lanes < nr_cpu_ids) {
> -		unsigned int cpu = get_cpu();
> -		struct nd_percpu_lane *ndl_lock, *ndl_count;
> +	struct nd_percpu_lane *ndl_lock;
>
> -		ndl_count = per_cpu_ptr(nd_region->lane, cpu);
> -		ndl_lock = per_cpu_ptr(nd_region->lane, lane);
> -		if (--ndl_count->count == 0)
> -			spin_unlock(&ndl_lock->lock);
> -		put_cpu();
> -	}
> -	put_cpu();
> +	ndl_lock = per_cpu_ptr(nd_region->lane, lane);
> +	WARN_ON(ndl_lock->nestcnt == 0);
> +	WARN_ON(ndl_lock->owner != current);
> +	if (--ndl_lock->nestcnt)
> +		return;
> +
> +	ndl_lock->owner = NULL;
> +	spin_unlock(&ndl_lock->lock);
>  }
> EXPORT_SYMBOL(nd_region_release_lane);
>
> @@ -1042,7 +1039,8 @@ static struct nd_region *nd_region_create(struct nvdimm_bus *nvdimm_bus,
>
>  		ndl = per_cpu_ptr(nd_region->lane, i);
>  		spin_lock_init(&ndl->lock);
> -		ndl->count = 0;
> +		ndl->owner = NULL;
> +		ndl->nestcnt = 0;
>  	}
>
>  	for (i = 0; i < ndr_desc->num_mappings; i++) {
>
> > Thanks,
> > Yongxin
>
Consider the recursive call to nd_region_acquire_lane() in the following situation.
Will there be a deadlock?

      Thread A                      Thread B
         |                             |
         |                             |
       CPU 1                         CPU 2
         |                             |
         |                             |
get lock for Lane 1           get lock for Lane 2
         |                             |
         |                             |
 migrate to CPU 2              migrate to CPU 1
         |                             |
         |                             |
wait lock for Lane 2          wait lock for Lane 1
         |                             |
         |                             |
         +--------------+--------------+
                        |
                    deadlock?
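In code form, one side of that interleaving looks roughly like this (my
illustration against the patch above, assuming a hypothetical caller that
acquires the lane recursively; not part of the patch):

    unsigned int outer, inner;

    outer = nd_region_acquire_lane(nd_region);  /* Thread A on CPU 1: takes
                                                 * lane 1's lock, owner = A */

    /* A is preempted and migrated to CPU 2, still holding lane 1's lock */

    inner = nd_region_acquire_lane(nd_region);  /* lane recomputed from CPU 2;
                                                 * lane 2's owner is Thread B,
                                                 * so A blocks on lane 2 while
                                                 * holding lane 1.  Thread B
                                                 * mirrors this on CPU 1. */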
Thanks,
Yongxin
> Sebastian