Message-ID: <CABPqkBRdK5WwokEWE3tQZiAyO3pWbS9aUn7HUkQT+XsMYfJUiA@mail.gmail.com>
Date:	Mon, 8 Jul 2013 22:20:21 +0200
From:	Stephane Eranian <eranian@...gle.com>
To:	Dave Hansen <dave.hansen@...el.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	"mingo@...e.hu" <mingo@...e.hu>, dave.hansen@...ux.intel.com,
	"ak@...ux.intel.com" <ak@...ux.intel.com>,
	Jiri Olsa <jolsa@...hat.com>
Subject: Re: [PATCH] perf: fix interrupt handler timing harness

On Mon, Jul 8, 2013 at 10:05 PM, Dave Hansen <dave.hansen@...el.com> wrote:
> On 07/08/2013 11:08 AM, Stephane Eranian wrote:
>> I admit I have some issues with your patch and what it is trying to avoid.
>> There is already interrupt throttling. Your code seems to address latency
>> issues in the handler rather than rate issues, yet to mitigate the latency
>> it modifies the throttling.
>
> If we have too many interrupts, we need to drop the rate (existing
> throttling).
>
> If the interrupts _consistently_ take too long individually they can
> starve out all the other CPU users.  I saw no way to make them finish
> faster, so the only recourse is to also drop the rate.
>
I think we need to investigate why some interrupts take so much time.
It could be HW, it could be SW. We are not talking about old hardware here.
Once we understand this, we will know whether and how to adjust the
timing thresholds in the patch.

>> For some unknown reasons, my HSW interrupt handler goes crazy for
>> a while running a very simple:
>>    $ perf record -e cycles branchy_loop
>>
>> And I do see in the log:
>> perf samples too long (2546 > 2500), lowering
>> kernel.perf_event_max_sample_rate to 50000
>>
>> Which is an enormous latency. I instrumented the code, and under
>> normal conditions the latency of the handler for this perf run is
>> about 500ns, which is consistent with what I see on SNB.
>
> I was seeing latencies near 1 second from time to time, but
> _consistently_ in the hundreds of milliseconds.

On my systems, I see 500ns with one session running. But on HSW,
something else is going on with bursts at 2500ns. That's not normal.
I want an explanation for this.
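
As a side note, the 2500 threshold in the quoted log line can be derived
from the default sysctl values. Below is a simplified sketch of the per-sample
time budget the kernel enforces, assuming the defaults
kernel.perf_event_max_sample_rate=100000 and kernel.perf_cpu_time_max_percent=25
(the real kernel computes this per tick, but the resulting number is the same):

```python
def sample_allowed_ns(max_sample_rate_hz, cpu_time_max_percent):
    # Per-sample budget: the fraction of one second the kernel is willing
    # to spend in perf interrupt handlers, divided by the sample rate.
    ns_per_sec = 1_000_000_000
    return ns_per_sec * cpu_time_max_percent // 100 // max_sample_rate_hz

# Defaults give the 2500 ns threshold seen in the log message:
print(sample_allowed_ns(100_000, 25))  # 2500

# After the kernel halves the rate to 50000 Hz, the budget doubles:
print(sample_allowed_ns(50_000, 25))   # 5000
```

So a burst of ~2546 ns handlers, as reported in the log, is just over the
default budget and triggers the rate reduction.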
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/