linux-kernel - Re: [PATCH] perf/x86/amd: Make HW_CACHE_REFERENCES and HW_CACHE

Message-ID: <20160818162522.GB5049@nazgul.tnic>
Date:   Thu, 18 Aug 2016 18:25:22 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     Matt Fleming <matt@...eblueprint.co.uk>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH] perf/x86/amd: Make HW_CACHE_REFERENCES and
 HW_CACHE_MISSES measure L2

On Mon, Aug 15, 2016 at 04:13:16PM +0100, Matt Fleming wrote:
> They're referred to as "LLC Reference" and "LLC Misses" in the Intel
> SDM Table 18-1 and "Longest latency cache references/misses" in Table
> 19-1.

Btw, it warns us right:

"Because cache hierarchy, cache sizes and other implementation-specific
characteristics; value comparison to estimate performance differences is
not recommended."

> 
> > I could try to find better/more fitting event selectors on AMD...
>  
> If you've got any other suggestions, I'm all ears.

So there are no LLC events on AMD, in the sense that no event selector
always means "last-level cache" and automagically picks the right level,
whether the LLC on a given part is the L2 or the L3.

If we have to be correct on AMD, we'd have to check whether the part has
an L3 and then choose the L3 events, say, something like

"EventSelect 4E1h L3 Cache Misses" and "EventSelect 4E2h L3 Fills caused
by L2 Evictions"

and if the LLC is the L2 (think client parts) then take the ones you've
selected.

I guess amd_pmu_event_map() could be taught to return the proper event
map depending on the part.
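
A rough user-space sketch of that idea, not kernel code: a lookup that
returns a different event table depending on whether the part has an L3.
Every sketch_* name is made up for illustration. The L2 values are
placeholders, because the selectors from the patch are not quoted in this
mail. The L3 values are the bare EventSelect numbers mentioned above, not
the full raw counter encoding, and using "L3 Fills caused by L2 Evictions"
as the references event is only an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the generic perf cache event ids. */
enum sketch_hw_event {
	SKETCH_HW_CACHE_REFERENCES,
	SKETCH_HW_CACHE_MISSES,
	SKETCH_HW_MAX,
};

/* Placeholders only: the real L2 selectors are not given in this mail. */
#define SKETCH_L2_REFERENCES	0x001ULL
#define SKETCH_L2_MISSES	0x002ULL

/* Bare EventSelect numbers quoted in the mail (no unit mask, no raw encoding). */
#define SKETCH_L3_FILLS		0x4E2ULL	/* L3 Fills caused by L2 Evictions */
#define SKETCH_L3_MISSES	0x4E1ULL	/* L3 Cache Misses */

/* Event table for parts where the L2 is the last-level cache. */
static const uint64_t sketch_l2_map[SKETCH_HW_MAX] = {
	[SKETCH_HW_CACHE_REFERENCES]	= SKETCH_L2_REFERENCES,
	[SKETCH_HW_CACHE_MISSES]	= SKETCH_L2_MISSES,
};

/* Event table for parts where the L3 is the last-level cache. */
static const uint64_t sketch_l3_map[SKETCH_HW_MAX] = {
	[SKETCH_HW_CACHE_REFERENCES]	= SKETCH_L3_FILLS,
	[SKETCH_HW_CACHE_MISSES]	= SKETCH_L3_MISSES,
};

/*
 * Return the event number for a generic cache event, picking the table
 * by LLC level. Returns 0 for an out-of-range event index.
 */
uint64_t sketch_amd_event_map(bool has_l3, int hw_event)
{
	if (hw_event < 0 || hw_event >= SKETCH_HW_MAX)
		return 0;

	return has_l3 ? sketch_l3_map[hw_event] : sketch_l2_map[hw_event];
}
```

The has_l3 flag would have to be computed once per part, outside this
lookup.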

Now, the L3 detection could be carved out from some pieces in
arch/x86/kernel/cpu/intel_cacheinfo.c but I'd need to swap in all that
code again...
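
For illustration only, one way the "does this part have an L3" check could
look, assuming the AMD CPUID leaf 0x80000006 layout where EDX carries the
L3 size in bits 31:18 and the L3 associativity in bits 15:12. This is a
stand-alone decode of a register value, not the code in
intel_cacheinfo.c. The sketch_* names are hypothetical, and actually
reading EDX (for example with the compiler's CPUID helpers) is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout of CPUID 0x80000006 EDX on AMD parts. */
#define SKETCH_L3_SIZE_SHIFT	18
#define SKETCH_L3_SIZE_MASK	0x3FFFu	/* bits 31:18, size in 512 KB units */
#define SKETCH_L3_ASSOC_SHIFT	12
#define SKETCH_L3_ASSOC_MASK	0xFu	/* bits 15:12, 0 means disabled */

/*
 * Decode an EDX value from CPUID leaf 0x80000006 and report whether it
 * describes a usable L3: non-zero size and non-zero associativity.
 */
bool sketch_amd_has_l3(uint32_t edx)
{
	uint32_t size  = (edx >> SKETCH_L3_SIZE_SHIFT) & SKETCH_L3_SIZE_MASK;
	uint32_t assoc = (edx >> SKETCH_L3_ASSOC_SHIFT) & SKETCH_L3_ASSOC_MASK;

	return size != 0 && assoc != 0;
}
```

If this returns false, the L2 would be treated as the LLC.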

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.