Message-ID: <537E6C9F.5060406@linux.vnet.ibm.com>
Date: Fri, 23 May 2014 03:01:11 +0530
From: "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Borislav Petkov <bp@...en8.de>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
Jacob Pan <jacob.jun.pan@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>,
Borislav Petkov <bp@...e.de>, Ingo Molnar <mingo@...nel.org>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
"ego@...ux.vnet.ibm.com" <ego@...ux.vnet.ibm.com>,
Oleg Nesterov <oleg@...hat.com>
Subject: Re: [PATCH] intel_rapl: Correct hotplug correction

On 05/22/2014 06:02 PM, Peter Zijlstra wrote:
> On Thu, May 22, 2014 at 05:24:33PM +0530, Srivatsa S. Bhat wrote:
>> Yeah, it's complicated and perhaps we can do much better than that. But I'll
>> try to explain why there are so many different locks in the existing code.
>>
[...]
>
> So I think we can reduce it to just the one rwsem (with recursion) if we
> shoot CPU_POST_DEAD in the head.
>
Ok, I'll take a look at the cpufreq core and see how we can get rid of the
POST_DEAD case there. I myself had added that (sorry!) to solve a complicated
deadlock involving a race between CPU offline and a task writing to one of
the cpufreq sysfs files. The sysfs writer task would increment the kobject
refcount and call get_online_cpus(), whereas the CPU offline task would wait
for the kobj refcount to drop to zero, while still holding the hotplug lock.
Thus the two tasks would end up waiting on each other indefinitely.

So using POST_DEAD had enabled us to wait for the refcount to drop to zero
without holding the hotplug lock, which allowed the sysfs writer to get
past get_online_cpus(), finish its job and finally drop the refcount.
Anyway, I'll take a fresh look to see if we can overcome that problem in
some other way.
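
To make the race above concrete, here is a rough user-space model of it in
Python. It is only a sketch, not the cpufreq code: the names hotplug_lock,
kobj_refcount, sysfs_writer, cpu_offline and run_model are made up for
illustration, a plain mutex stands in for the hotplug lock, and two events
force the problematic interleaving (the writer holds a reference before the
offline path takes the lock). The offline path waits for the refcount only
after dropping the lock, which is the role POST_DEAD plays.

```python
import threading

def run_model():
    hotplug_lock = threading.Lock()        # stands in for the CPU hotplug lock
    ref_cond = threading.Condition()       # protects the refcount below
    state = {"kobj_refcount": 0}           # stands in for the kobject refcount
    writer_has_ref = threading.Event()     # forces the racy interleaving
    offline_has_lock = threading.Event()
    log = []

    def sysfs_writer():
        with ref_cond:
            state["kobj_refcount"] += 1    # the sysfs write pins the kobject
        writer_has_ref.set()
        offline_has_lock.wait()            # offline now holds the hotplug lock
        with hotplug_lock:                 # models get_online_cpus()
            log.append("writer: store done")
        with ref_cond:
            state["kobj_refcount"] -= 1    # drop the kobject reference
            ref_cond.notify_all()

    def cpu_offline():
        writer_has_ref.wait()
        with hotplug_lock:
            offline_has_lock.set()
            log.append("offline: teardown under hotplug lock")
            # Waiting for the refcount to hit zero HERE would deadlock:
            # the writer cannot drop its reference until it gets the lock.
        # POST_DEAD stage: the hotplug lock has been released by now.
        with ref_cond:
            ref_cond.wait_for(lambda: state["kobj_refcount"] == 0)
        log.append("offline: kobject released in POST_DEAD")

    threads = [threading.Thread(target=f) for f in (sysfs_writer, cpu_offline)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    finished = all(not t.is_alive() for t in threads)
    return finished, log
```

Calling run_model() returns whether both threads finished, plus the order of
events. Moving the wait_for() call inside the "with hotplug_lock" block
reproduces the hang in this model (the join timeout then reports it).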
> Because currently we cannot take the rwsem in exclusive mode over the
> whole thing because of POST_DEAD.
>
> Once we kill that, the hotplug lock's exclusive mode can cover the
> entire hotplug operation.
>
> For (un)register we can also use the exclusive lock, (un)register of
> notifiers should not happen often and should equally not be performance
> critical, so using the exclusive lock should be just fine.
>
> That means we can then remove cpu_add_remove_lock from both the register
> and hotplug ops proper. (un)register_cpu_notifier() should get an
> assertion that we hold the hotplug lock in exclusive mode.
>
> That leaves the non-exclusive lock to guard against hotplug happening.
>
> Now, last time Linus said he would like that to be a non-lock, and have
> it weakly serialized, RCU style. Not sure we can fully pull that off,
> haven't thought that through yet.
Thank you for the explanation!
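
The exclusive-mode assertion suggested above for (un)register_cpu_notifier()
could be modelled roughly as follows. This is a toy Python sketch under my own
assumptions, not kernel code: the HotplugLock class and its method names are
hypothetical, it does not implement the recursion mentioned earlier, and it
only remembers which thread holds the lock exclusively so that registration
can check it.

```python
import threading

class HotplugLock:
    """Toy reader/writer lock that remembers its exclusive owner."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = None                # ident of the exclusive holder

    def read_lock(self):
        with self._cond:
            self._cond.wait_for(lambda: self._writer is None)
            self._readers += 1

    def read_unlock(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def write_lock(self):
        with self._cond:
            self._cond.wait_for(
                lambda: self._writer is None and self._readers == 0)
            self._writer = threading.get_ident()

    def write_unlock(self):
        with self._cond:
            self._writer = None
            self._cond.notify_all()

    def held_exclusive(self):
        # True only if the calling thread holds the lock in exclusive mode.
        return self._writer == threading.get_ident()

hotplug_lock = HotplugLock()
notifiers = []

def register_cpu_notifier(callback):
    # The proposed assertion: callers must already hold the lock exclusively.
    if not hotplug_lock.held_exclusive():
        raise AssertionError("hotplug lock not held in exclusive mode")
    notifiers.append(callback)
```

A caller would take hotplug_lock.write_lock(), call register_cpu_notifier(),
and then write_unlock(); calling it with only the read side held (or with no
lock at all) raises.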
>
>> I think Oleg had a proposed patch to use per-cpu rwsem in CPU hotplug to
>> drastically simplify this whole locking scheme. I think we could look at
>> that again.
>
> I don't think that was to simplify things, the hotplug lock is basically
> an open coded rw lock already, so that was to make it reuse the per-cpu
> rwsem code.
>
Ah, ok!

Regards,
Srivatsa S. Bhat