linux-kernel - RE: [patch] x86, perf_counter, bts: optimize BTS overflow handling


Message-ID: <928CFBE8E7CB0040959E56B4EA41A77EC47B0F20@irsmsx504.ger.corp.intel.com>
Date:	Tue, 15 Sep 2009 13:07:26 +0100
From:	"Metzger, Markus T" <markus.t.metzger@...el.com>
To:	Pavel Machek <pavel@....cz>
CC:	"mingo@...e.hu" <mingo@...e.hu>,
	"tglx@...utronix.de" <tglx@...utronix.de>,
	"hpa@...or.com" <hpa@...or.com>,
	"markus.t.metzger@...il.com" <markus.t.metzger@...il.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"a.p.zijlstra@...llo.nl" <a.p.zijlstra@...llo.nl>
Subject: RE: [patch] x86, perf_counter, bts: optimize BTS overflow handling

>-----Original Message-----
>From: Pavel Machek [mailto:pavel@....cz]
>Sent: Tuesday, September 15, 2009 1:34 PM
>To: Metzger, Markus T
>Cc: mingo@...e.hu; tglx@...utronix.de; hpa@...or.com; markus.t.metzger@...il.com;
>linux-kernel@...r.kernel.org; a.p.zijlstra@...llo.nl
>Subject: Re: [patch] x86, perf_counter, bts: optimize BTS overflow handling
>
>On Tue 2009-09-08 08:31:10, Markus Metzger wrote:
>> Draining the BTS buffer on a buffer overflow interrupt takes too long
>> resulting in a kernel lockup when tracing the kernel.
>
>Wait. If 'takes too long' leads to 'lockup'...solution is not making
>it faster. Solution should be making kernel robust -- maybe disable
>tracing when it can't keep up or something...? or maybe disabling
>interrupt until you handle the previous one...?
>								Pavel

Maybe I got the term 'lockup' wrong? The symptom is that the system gets slower
and then seems to freeze. When I disable tracing in such a frozen system (via
a debugger), it recovers. When I attached the debugger, one core was busy
processing the BTS overflow and the other core was waiting for an SMP call
response.

BTS overflow interrupts are processed one after the other. BTS is disabled
while we handle the interrupt, so we do not trace the overflow handling itself,
but we trace everything else.

The problem as I see it is that the kernel generates too much trace, and
turning that trace into samples takes so long that the kernel no longer makes
progress, or makes only very slow progress.

Optimizing sample generation allows it to make progress fast enough.
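
A rough way to picture this is the back-of-the-envelope model below. It is
only a sketch: the function name and all numbers (cost per traced branch, cost
per record turned into a sample, buffer size) are assumptions for
illustration, not measurements from the patch. It assumes the traced kernel
fills the buffer at a fixed cost per branch, and that the overflow handler
then drains it at a fixed cost per record while BTS is disabled.

```python
def useful_fraction(records_per_buffer, ns_per_traced_branch, ns_per_record_drained):
    """Fraction of time spent on traced work rather than on draining BTS.

    All three parameters are hypothetical inputs to a simplified model:
    one buffer fill followed by one complete drain, repeated forever.
    """
    # Time the traced kernel runs before the buffer overflows.
    fill_ns = records_per_buffer * ns_per_traced_branch
    # Time the overflow handler needs to turn every record into a sample.
    drain_ns = records_per_buffer * ns_per_record_drained
    return fill_ns / (fill_ns + drain_ns)


# Made-up numbers: a slow and a faster per-record sample path.
slow = useful_fraction(1000, ns_per_traced_branch=5, ns_per_record_drained=1000)
fast = useful_fraction(1000, ns_per_traced_branch=5, ns_per_record_drained=50)
print(f"slow handler: {slow:.2%} of time is traced work")
print(f"fast handler: {fast:.2%} of time is traced work")
```

In this model the buffer size cancels out, so a bigger buffer would not help.
Only the per-record cost of generating a sample changes how much progress the
traced kernel makes, which is why the patch targets that cost.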

regards,
markus.
