Message-ID: <20090810221307.GA19236@sig21.net>
Date: Tue, 11 Aug 2009 00:13:07 +0200
From: Johannes Stezenbach <js@...21.net>
To: Ingo Molnar <mingo@...e.hu>
Cc: Robert Richter <robert.richter@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Andi Kleen <andi@...stfloor.org>, x86@...nel.org,
linux-kernel@...r.kernel.org, "Rafael J. Wysocki" <rjw@...k.pl>
Subject: Re: 2.6.31-rc5 regression: x86 MCE malfunction on Thinkpad T42p
On Mon, Aug 10, 2009 at 11:31:33PM +0200, Ingo Molnar wrote:
> * Johannes Stezenbach <js@...21.net> wrote:
> >
> > # cat /proc/cpuinfo
> > processor : 0
> > vendor_id : GenuineIntel
> > cpu family : 6
> > model : 13
> > model name : Intel(R) Pentium(R) M processor 1.80GHz
>
> ah, yes. There's no cache-references/misses, because in
> arch/x86/kernel/cpu/perf_counter.c we have two zero entries:
>
> static const u64 p6_perfmon_event_map[] =
> {
>   [PERF_COUNT_HW_CPU_CYCLES]          = 0x0079,
>   [PERF_COUNT_HW_INSTRUCTIONS]        = 0x00c0,
>   [PERF_COUNT_HW_CACHE_REFERENCES]    = 0x0000,   <----------
>   [PERF_COUNT_HW_CACHE_MISSES]        = 0x0000,   <----------
>   [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c4,
>   [PERF_COUNT_HW_BRANCH_MISSES]       = 0x00c5,
>   [PERF_COUNT_HW_BUS_CYCLES]          = 0x0062,
> };
>
> i.e. PERF_COUNT_HW_CACHE_REFERENCES and PERF_COUNT_HW_CACHE_MISSES
> are not filled in yet.
>
> Could you try something like:
>
> perf stat -e r0f2e true
>
> (0x2e: L2 requests, 0x0f: all units)
>
> if i checked the docs right that counter would give us L2 cache
> stats - does it display non-zero values?

# ./perf stat -e r0f2e true

 Performance counter stats for 'true':

          10584  raw 0xf2e

    0.001159924  seconds time elapsed

The number also increases for larger programs than "true".

According to /usr/share/oprofile/i386/p6_mobile/events and
http://oprofile.sourceforge.net/docs/intel-p6-mobile-events.php
event 0x2e with unit mask 0x0f is "L2 requests, all units", but I
can't tell how to count cache references vs. misses. Or does that
work with unit mask 0x0e vs. 0x01?
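
For reference, the raw codes used in this thread pack the unit mask into bits 8-15 and the event select into bits 0-7, so r0f2e is event 0x2e with unit mask 0x0f. A minimal sketch of that packing follows; the helper names are hypothetical and not part of perf:

```python
def encode_raw_event(event_select, unit_mask):
    """Build a perf raw-event string such as 'r0f2e' (hypothetical helper)."""
    if not (0 <= event_select <= 0xFF and 0 <= unit_mask <= 0xFF):
        raise ValueError("event select and unit mask must each fit in 8 bits")
    # Unit mask goes in bits 8-15, event select in bits 0-7.
    return "r{:04x}".format((unit_mask << 8) | event_select)


def decode_raw_event(raw):
    """Split a raw-event string into (event_select, unit_mask)."""
    code = int(raw[1:] if raw.startswith("r") else raw, 16)
    return code & 0xFF, (code >> 8) & 0xFF


L2_RQSTS = 0x2E  # "L2 requests" event select, as quoted from the oprofile list

print(encode_raw_event(L2_RQSTS, 0x0F))  # all units
print(encode_raw_event(L2_RQSTS, 0x0E))  # unit mask 0x0e
print(encode_raw_event(L2_RQSTS, 0x01))  # unit mask 0x01
print(decode_raw_event("r0f2e"))
```

Note that 0x0e and 0x01 together cover exactly the bits of 0x0f, which is why the two masks are candidates for splitting the "all units" count in two.
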

# ./perf stat -e r0e2e true

 Performance counter stats for 'true':

          10147  raw 0xe2e

    0.001121651  seconds time elapsed

# ./perf stat -e r012e true

 Performance counter stats for 'true':

            468  raw 0x12e

    0.001130870  seconds time elapsed

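
As a rough cross-check of the three runs above, the arithmetic can be sketched like this. It assumes, without confirmation from the docs, that unit mask 0x01 counts the requests that miss and 0x0e the ones that hit. The counts come from separate runs of "true", so they are not expected to add up exactly:

```python
# Counts copied from the three perf stat runs above (separate invocations).
l2_all = 10584       # r0f2e, unit mask 0x0f
l2_mask_0e = 10147   # r0e2e, unit mask 0x0e
l2_mask_01 = 468     # r012e, unit mask 0x01


def miss_ratio(hits, misses):
    """Fraction of requests that miss, ASSUMING mask 0x01 means 'miss'."""
    return misses / (hits + misses)


# Difference between the summed partial counts and the "all units" run.
run_to_run_delta = (l2_mask_0e + l2_mask_01) - l2_all

print("delta between runs:", run_to_run_delta)
print("assumed miss ratio: {:.1%}".format(miss_ratio(l2_mask_0e, l2_mask_01)))
```

The two partial counts come within a few dozen events of the 0x0f total, which is consistent with the masks partitioning the requests but does not prove which mask corresponds to misses.
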
> > Could the warning be caused by the cpufreq ondemand governor? ISTR
> > that one should switch to the performance governor before doing
> > any profiling, but I forgot for this test.
>
> there might be a connection - it could in theory cause sched_clock()
> transients and confuse the ring-buffer time-stamping.

After a fresh boot tomorrow I'll check whether the warning also
appears with the performance governor.

Thanks,
Johannes