Message-ID: <20090810221307.GA19236@sig21.net>
Date:	Tue, 11 Aug 2009 00:13:07 +0200
From:	Johannes Stezenbach <js@...21.net>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Robert Richter <robert.richter@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Andi Kleen <andi@...stfloor.org>, x86@...nel.org,
	linux-kernel@...r.kernel.org, "Rafael J. Wysocki" <rjw@...k.pl>
Subject: Re: 2.6.31-rc5 regression: x86 MCE malfunction on Thinkpad T42p

On Mon, Aug 10, 2009 at 11:31:33PM +0200, Ingo Molnar wrote:
> * Johannes Stezenbach <js@...21.net> wrote:
> > 
> > # cat /proc/cpuinfo 
> > processor	: 0
> > vendor_id	: GenuineIntel
> > cpu family	: 6
> > model		: 13
> > model name	: Intel(R) Pentium(R) M processor 1.80GHz
> 
> ah, yes. There's no cache-references/misses, because in 
> arch/x86/kernel/cpu/perf_counter.c we have two zero entries:
> 
> static const u64 p6_perfmon_event_map[] =
> {
>   [PERF_COUNT_HW_CPU_CYCLES]            = 0x0079,
>   [PERF_COUNT_HW_INSTRUCTIONS]          = 0x00c0,
>   [PERF_COUNT_HW_CACHE_REFERENCES]      = 0x0000, <----------
>   [PERF_COUNT_HW_CACHE_MISSES]          = 0x0000, <----------
>   [PERF_COUNT_HW_BRANCH_INSTRUCTIONS]   = 0x00c4,
>   [PERF_COUNT_HW_BRANCH_MISSES]         = 0x00c5,
>   [PERF_COUNT_HW_BUS_CYCLES]            = 0x0062,
> };
> 
> i.e. PERF_COUNT_HW_CACHE_REFERENCES and PERF_COUNT_HW_CACHE_MISSES 
> are not filled in yet.
> 
> Could you try something like:
> 
>     perf stat -e r0f2e true
> 
> (0x2e: L2 requests, 0x0f: all units)
> 
> if i checked the docs right that counter would give us L2 cache 
> stats - does it display non-zero values?

# ./perf stat -e r0f2e true

 Performance counter stats for 'true':

          10584  raw 0xf2e               

    0.001159924  seconds time elapsed

The count is non-zero, and it also increases for programs larger than "true".

According to /usr/share/oprofile/i386/p6_mobile/events and
http://oprofile.sourceforge.net/docs/intel-p6-mobile-events.php
event 0x2e with unit mask 0x0f is "L2 requests, all units", but I
can't tell from that how to count cache references vs. misses.
Or would unit mask 0x0e vs. 0x01 give that split?
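
For reference, the raw codes used in this thread are just the unit mask
in the second byte and the event select in the low byte. The following
is a small sketch of that packing, not anything taken from perf itself;
the helper names raw_event() and perf_raw_arg() are made up for
illustration.

```python
# Sketch only: helper names are hypothetical, not part of perf.
# Layout assumed here is the x86 event-select layout used by the raw
# codes in this thread: bits 0-7 event select, bits 8-15 unit mask.

def raw_event(event_select: int, unit_mask: int) -> int:
    """Pack an event select and a unit mask into one raw event code."""
    if not 0 <= event_select <= 0xFF:
        raise ValueError("event select must fit in 8 bits")
    if not 0 <= unit_mask <= 0xFF:
        raise ValueError("unit mask must fit in 8 bits")
    return (unit_mask << 8) | event_select


def perf_raw_arg(event_select: int, unit_mask: int) -> str:
    """Format the code the way it is passed to 'perf stat -e'."""
    return "r%04x" % raw_event(event_select, unit_mask)


if __name__ == "__main__":
    L2_RQSTS = 0x2E  # "L2 requests" per the oprofile p6_mobile event list
    for mask in (0x0F, 0x0E, 0x01):
        print(perf_raw_arg(L2_RQSTS, mask))
```

With event 0x2e this yields the three arguments tried in this mail:
r0f2e, r0e2e and r012e.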

# ./perf stat -e r0e2e true

 Performance counter stats for 'true':

          10147  raw 0xe2e               

    0.001121651  seconds time elapsed

# ./perf stat -e r012e true

 Performance counter stats for 'true':

            468  raw 0x12e               

    0.001130870  seconds time elapsed
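
A quick sanity check on the three counts above, as a sketch: if unit
masks 0x0e and 0x01 really partition the 0x0f count, their sum should
land close to the 0x0f figure (the runs are separate, so an exact match
is not expected). Treating the 0x01 count as "misses" is only the
guess from the question above, not something the numbers prove.

```python
# Counts copied from the three 'perf stat' runs in this mail.
COUNT_MASK_0F = 10584  # r0f2e, all units
COUNT_MASK_0E = 10147  # r0e2e
COUNT_MASK_01 = 468    # r012e


def relative_difference(a: int, b: int) -> float:
    """Relative difference of a from b, as a fraction of b."""
    return abs(a - b) / b


def ratio(part: int, whole: int) -> float:
    """Share of 'whole' accounted for by 'part'."""
    return part / whole


split_sum = COUNT_MASK_0E + COUNT_MASK_01
drift = relative_difference(split_sum, COUNT_MASK_0F)

# ASSUMPTION: mask 0x01 counts misses and mask 0x0f counts references.
assumed_miss_ratio = ratio(COUNT_MASK_01, COUNT_MASK_0F)

if __name__ == "__main__":
    print("0x0e + 0x01 = %d vs 0x0f = %d" % (split_sum, COUNT_MASK_0F))
    print("difference between runs: %.2f%%" % (drift * 100))
    print("0x01 / 0x0f: %.2f%%" % (assumed_miss_ratio * 100))
```

The two partial counts add up to within about 0.3% of the all-units
count, which is consistent with the masks being a split of the same
event, and the 0x01 share comes out at roughly 4.4%.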


> > Could the warning be caused by the cpufreq ondemand governor? ISTR 
> > that one should switch to the performance governor before doing 
> > any profiling, but I forgot for this test.
> 
> there might be a connection - it could in theory cause sched_clock() 
> transients and confuse the ring-buffer time-stamping.

I'll check tomorrow, after a fresh boot, whether the warning also
appears with the performance governor.


Thanks
Johannes
