Message-ID: <e3812c7a1202ee79101406e7003dff9a@codeaurora.org>
Date: Mon, 28 Aug 2017 16:12:01 -0700
From: Vikram Mulukutla <markivx@...eaurora.org>
To: Will Deacon <will.deacon@....com>
Cc: qiaozhou <qiaozhou@...micro.com>,
Thomas Gleixner <tglx@...utronix.de>,
John Stultz <john.stultz@...aro.org>, sboyd@...eaurora.org,
LKML <linux-kernel@...r.kernel.org>,
Wang Wilbur <wilburwang@...micro.com>,
Marc Zyngier <marc.zyngier@....com>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel-owner@...r.kernel.org, sudeep.holla@....com
Subject: Re: [Question]: try to fix contention between expire_timers and
try_to_del_timer_sync
Hi Will,
On 2017-08-25 12:48, Vikram Mulukutla wrote:
> Hi Will,
>
> On 2017-08-15 11:40, Will Deacon wrote:
>> Hi Vikram,
>>
>> On Thu, Aug 03, 2017 at 04:25:12PM -0700, Vikram Mulukutla wrote:
>>> On 2017-07-31 06:13, Will Deacon wrote:
>>> >On Fri, Jul 28, 2017 at 12:09:38PM -0700, Vikram Mulukutla wrote:
>>> >>On 2017-07-28 02:28, Will Deacon wrote:
>>> >>>On Thu, Jul 27, 2017 at 06:10:34PM -0700, Vikram Mulukutla wrote:
>>>
>>> >>>
>>> >>This does seem to help. Here's some data after 5 runs with and
>>> >>without the patch.
>>> >
>>> >Blimey, that does seem to make a difference. Shame it's so ugly! Would you
>>> >be able to experiment with other values for CPU_RELAX_WFE_THRESHOLD? I had
>>> >it set to 10000 in the diff I posted, but that might be higher than
>>> >optimal. It would be interesting to see if it correlates with
>>> >num_possible_cpus() for the highly contended case.
>>> >
>>> >Will
>>>
>>> Sorry for the late response - I should hopefully have some more data
>>> with different thresholds before the week is finished or on Monday.
>>
>> Did you get anywhere with the threshold heuristic?
>>
>> Will
>
> Here's some data from experiments that I finally got to today. I
> decided to recompile for every value of the threshold. Was doing a
> binary search of sorts and then started reducing by orders of
> magnitude. There are pairs of rows here:
>

Well, here's something interesting. I tried a different platform and
found that the workaround doesn't help much at all, similar to Qiao's
observation on his b.L chipset. Could it have something to do with the
WFE implementation or the event stream?

I modified your patch to use a __delay(1) in place of the WFEs, and this
was the result (still with the 10k threshold). The worst-case lock time
for cpu0 improves drastically. Given that cpu0 re-enables interrupts
between each lock attempt in my test case, I think the lock count
matters less here.
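
For readers following along, the threshold-then-delay idea described
above can be modelled in user space roughly as follows. This is only a
sketch under stated assumptions, not the patch from this thread:
relax_with_backoff(), backoff_delay(), backoff_calls and RELAX_THRESHOLD
are hypothetical names, the busy loop merely stands in for the kernel's
__delay(1), and a single global counter is assumed for simplicity.

```c
/* User-space model of "spin cheaply, back off after a threshold".
 * All names here are hypothetical; this is not the kernel patch. */

#define RELAX_THRESHOLD 10000u  /* the 10k threshold mentioned above */

/* Number of times the back-off path ran, kept so behaviour can be checked. */
static unsigned long backoff_calls;

/* Count of relax calls since the last back-off (assumed global here). */
static unsigned int relax_count;

/* Stand-in for __delay(loops): burn roughly 'loops' iterations. */
static void backoff_delay(unsigned long loops)
{
	for (volatile unsigned long i = 0; i < loops; i++)
		;
	backoff_calls++;
}

/* Called on every failed lock attempt. Most calls only bump the counter;
 * every RELAX_THRESHOLD-th call resets it and takes the delay path. */
static void relax_with_backoff(void)
{
	if (++relax_count >= RELAX_THRESHOLD) {
		relax_count = 0;
		backoff_delay(1);
	}
}
```

In this model a waiter that keeps losing the lock pauses once every
RELAX_THRESHOLD attempts, which is the window in which a slower CPU
gets a chance to take the lock.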

cpu_relax() patch with WFEs (original workaround):
(pairs of rows: first row is with c0 at 300MHz, second row is with
c0 at 1.9GHz. Both rows have cpu4 at 2.3GHz. Max time is in
microseconds.)

------------------------------------------------------|
c0 max time| c0 lock count| c4 max time| c4 lock count|
------------------------------------------------------|
     999843|            25|           2|      12988498| -> c0/cpu0 at 300MHz
          0|       8421132|           1|       9152979| -> c0/cpu0 at 1.9GHz
------------------------------------------------------|
     999860|           160|           2|      12963487|
          1|       8418492|           1|       9158001|
------------------------------------------------------|
     999381|           734|           2|      12988636|
          1|       8387562|           1|       9128056|
------------------------------------------------------|
     989800|           750|           3|      12996473|
          1|       8389091|           1|       9112444|
------------------------------------------------------|

cpu_relax() patch with __delay(1):
(pairs of rows: first row is with c0 at 300MHz, second row is with
c0 at 1.9GHz. Both rows have cpu4 at 2.3GHz. Max time is in
microseconds.)

------------------------------------------------------|
c0 max time| c0 lock count| c4 max time| c4 lock count|
------------------------------------------------------|
       7703|          1532|           2|      13035203| -> c0/cpu0 at 300MHz
          1|       8511686|           1|       8550411| -> c0/cpu0 at 1.9GHz
------------------------------------------------------|
       7801|          1561|           2|      13040188|
          1|       8553985|           1|       8609853|
------------------------------------------------------|
       3953|          1576|           2|      13049991|
          1|       8576370|           1|       8611533|
------------------------------------------------------|
       3953|          1557|           2|      13030553|
          1|       8509020|           1|       8543883|
------------------------------------------------------|

I should also note that my earlier kernel was 4.9-stable based
and the one above was on a 4.4-stable based kernel.
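
To put a rough number on the cpu0 worst case at 300MHz, the "c0 max
time" values from the two tables can be compared directly. The snippet
below is just an illustrative calculation on the figures reported in
this mail; worst_of() and worst_case_ratio() are hypothetical helper
names, and the comparison ignores the kernel-version difference noted
above.

```c
#include <stddef.h>

/* c0 max lock time in microseconds with c0 at 300MHz, one entry per run,
 * copied from the WFE table and the __delay(1) table respectively. */
static const unsigned long wfe_max_us[]   = { 999843, 999860, 999381, 989800 };
static const unsigned long delay_max_us[] = { 7703, 7801, 3953, 3953 };

#define N_RUNS (sizeof(wfe_max_us) / sizeof(wfe_max_us[0]))

/* Return the largest value in v[0..n-1]. */
static unsigned long worst_of(const unsigned long *v, size_t n)
{
	unsigned long worst = 0;

	for (size_t i = 0; i < n; i++)
		if (v[i] > worst)
			worst = v[i];
	return worst;
}

/* Integer ratio of the worst WFE run to the worst __delay(1) run. */
static unsigned long worst_case_ratio(void)
{
	return worst_of(wfe_max_us, N_RUNS) / worst_of(delay_max_us, N_RUNS);
}
```

With these figures the worst WFE run is 999860us and the worst
__delay(1) run is 7801us, so the ratio comes out at 128, i.e. roughly
two orders of magnitude.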
Thanks,
Vikram
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project