[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABPqkBSR4a7PCyYxhE0kpepc3f0f6Rh6JU55_A3ubWYnN+G3iA@mail.gmail.com>
Date: Wed, 4 Jan 2012 21:33:27 +0000
From: Stephane Eranian <eranian@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, mingo@...e.hu, gleb@...hat.com,
asharma@...com, vince@...ter.net, wcohen@...hat.com
Subject: Re: perf_events: proposed fix for broken intr throttling (repost)
On Wed, Jan 4, 2012 at 9:24 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Wed, 2012-01-04 at 15:39 +0100, Stephane Eranian wrote:
>
>> In running some tests with 3.2.0-rc7-tip, I noticed unexpected throttling
>> notification samples. I was using fixed period with a long enough period
>> that I could not possibly hit the default limit of 100000 samples/sec/cpu.
>>
>> I investigated the matter and discovered that the following commit
>> is the culprit:
>>
>> commit 0f5a2601284237e2ba089389fd75d67f77626cef
>> Author: Peter Zijlstra <a.p.zijlstra@...llo.nl>
>> Date: Wed Nov 16 14:38:16 2011 +0100
>>
>> perf: Avoid a useless pmu_disable() in the perf-tick
>>
>>
>> The throttling mechanism REQUIRES that the hwc->interrupt counter be reset
>> at EACH timer tick. This is regardless of the fact that the counter is in fixed
>> period or frequency mode. The optimization introduced in this patch breaks this
>> by avoiding calling perf_ctx_adjust_freq() at each timer tick. For events with
>> fixed period, it would not adjust any period at all BUT it would reset the
>> throttling counter.
>>
>> Given the way the throttling mechanism is implemented we cannot avoid doing
>> some work at each timer tick. Otherwise we loose many samples for no good
>> reasons.
>>
>> One may also question the motivation behind checking the interrupt rate at
>> each timer tick rather than every second, i.e., average it out over a longer
>> period.
>
> That also allows your system to be dead for longer..
>
Yes, I concur...
>> I see two solutions short term:
>> 1 - revert the commit above
>> 2 - special case the situation with no frequency-based sampling event
>>
>> I have implemented solution 2 with the draft fix below. It does not invoke
>> perf_pmu_enable()/perf_pmu_disable(). I am not clear on whether or not this
>> is really needed in this case. Please advise.
>
> I don't think it needs that, I do dislike the unconditional iterate all
> events thing though. Maybe we can set some per-cpu state indicating
> someone got throttled (rare under normal operation -- you'd hope) and
> only iterate to unthrottle when we find this set.
>
Could try that too.
> I think the event scheduling resulting from migration will already
> re-enable the event, avoiding the loss of unthrottle due to that..
> although it would be good to verify that.
>
Yes, you're not dead forever, but still it is not acceptable as is.
Will code the solution you suggested.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists