Message-ID: <20160204055108.GY3469@vireshk>
Date: Thu, 4 Feb 2016 11:21:08 +0530
From: Viresh Kumar <viresh.kumar@...aro.org>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Shilpa Bhat <shilpabhatppc@...il.com>,
Juri Lelli <juri.lelli@....com>,
Rafael Wysocki <rjw@...ysocki.net>,
Lists linaro-kernel <linaro-kernel@...ts.linaro.org>,
"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
Saravana Kannan <skannan@...eaurora.org>,
Peter Zijlstra <peterz@...radead.org>,
Michael Turquette <mturquette@...libre.com>,
Steve Muckle <steve.muckle@...aro.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Morten Rasmussen <morten.rasmussen@....com>,
dietmar.eggemann@....com,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH V2 0/7] cpufreq: governors: Fix ABBA lockups
On 04-02-16, 00:50, Rafael J. Wysocki wrote:
> On Thu, Feb 4, 2016 at 12:31 AM, Shilpa Bhat <shilpabhatppc@...il.com> wrote:
> > Sorry for the delayed report. But I see the below backtrace on Power8 box. It
> > has 4 chips with 128 cpus.
Honestly, I wasn't expecting you to test this stuff, but I really
appreciate you doing that.
Thanks a lot.
> > [ 906.765768] Possible unsafe locking scenario:
> >
> > [ 906.765880]        CPU0                    CPU1
> > [ 906.765969]        ----                    ----

This race scenario is perhaps incomplete and difficult to understand
without the lines below:

                    Governor's EXIT             Update sampling rate from sysfs
                                                lock(s_active#91);

> > [ 906.766058]   lock(od_dbs_cdata.mutex);
> > [ 906.766170]                               lock(&dbs_data->mutex);
> > [ 906.766304]                               lock(od_dbs_cdata.mutex);
> > [ 906.766461]   lock(s_active#91);
> > [ 906.766572]
> > *** DEADLOCK ***
>
> This is exactly right. We've avoided one deadlock only to trip into
> another one.
As we discussed on IRC, we haven't introduced this deadlock with the
current series. It is the same one Juri reported a few days ago
while testing linus/master on TC2.
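To make the ordering problem concrete, here is a minimal user-space
sketch (not kernel code, and not how lockdep is actually implemented)
that records "B was taken while A was held" edges and flags the
acquisition that closes a cycle. The enum names, record_acquire() and
report_scenario() are made up for illustration; the edges are the ones
from the trace above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical lock classes, named after the locks in the report. */
enum lock_class {
	S_ACTIVE,
	DBS_DATA_MUTEX,
	OD_DBS_CDATA_MUTEX,
	NR_LOCK_CLASSES
};

/* order[a][b] != 0 means "b was acquired while a was held". */
static int order[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static int reachable(int from, int to, int *seen)
{
	int next;

	if (from == to)
		return 1;
	seen[from] = 1;
	for (next = 0; next < NR_LOCK_CLASSES; next++)
		if (order[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return 1;
	return 0;
}

/*
 * Record that 'wanted' is acquired while 'held' is held.
 * Returns 1 if an earlier path already leads wanted -> held,
 * i.e. the new edge closes a cycle (a possible deadlock).
 */
static int record_acquire(enum lock_class held, enum lock_class wanted)
{
	int seen[NR_LOCK_CLASSES] = { 0 };
	int inversion;

	assert(held < NR_LOCK_CLASSES && wanted < NR_LOCK_CLASSES);
	inversion = reachable(wanted, held, seen);
	order[held][wanted] = 1;
	return inversion;
}

/* Replay the two paths from the report; returns 1 if a cycle is found. */
int report_scenario(void)
{
	int bad = 0;

	memset(order, 0, sizeof(order));

	/* sysfs write: s_active -> dbs_data->mutex -> od_dbs_cdata.mutex */
	bad |= record_acquire(S_ACTIVE, DBS_DATA_MUTEX);
	bad |= record_acquire(DBS_DATA_MUTEX, OD_DBS_CDATA_MUTEX);
	assert(!bad);	/* this path on its own is consistent */

	/* governor EXIT: od_dbs_cdata.mutex held while the kobject is put */
	bad |= record_acquire(OD_DBS_CDATA_MUTEX, S_ACTIVE);
	return bad;
}
```

In this model the cycle is only reported on the last acquisition, when
the EXIT path takes s_active with od_dbs_cdata.mutex already held.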
> This happens because update_sampling_rate() acquires
> od_dbs_cdata.mutex which is held around cpufreq_governor_exit() by
> cpufreq_governor_dbs().
>
> Worse yet, a deadlock can still happen without (the new)
> dbs_data->mutex, just between s_active and od_dbs_cdata.mutex if
> update_sampling_rate() runs in parallel with
> cpufreq_governor_dbs()->cpufreq_governor_exit() and the latter wins
> the race.
>
> It looks like we need to drop the governor mutex before putting the
> kobject in cpufreq_governor_exit().
As we discussed, that wouldn't be trivial to implement.
Okay, here is a proposal for the current series and the series you
have posted, Rafael:
- Firstly, I would like to clarify that I don't have any issues with
rebasing on top of your series, it should be easy enough.
- One thing is for sure: nothing from these 3 series is getting
merged in 4.5, as we aren't fixing the real issue Shilpa/Juri have
reported.
- I think the first 4 patches here are just fine and don't need any
updates. They actually do the right thing and make the code so much
cleaner.
- So, can we apply the first 4 patches (which you have already
applied to bleeding-edge) now and do all further work on top of that?
Again, I can rebase if you merge your patches first, no issues at all :)
--
viresh