Message-ID: <alpine.LFD.2.00.0811261656040.3325@localhost.localdomain>
Date: Wed, 26 Nov 2008 17:10:27 +0100 (CET)
From: Thomas Gleixner <tglx@...utronix.de>
To: eranian@...il.com
cc: linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
mingo@...e.hu, x86@...nel.org, andi@...stfloor.org,
sfr@...b.auug.org.au
Subject: Re: [patch 21/24] perfmon: Intel architectural PMU support (x86)
On Wed, 26 Nov 2008, stephane eranian wrote:
> In any case, the idea is to encapsulate as much PMU-model-specific
> code as possible into each module. That is why you are seeing some
> redundancy.
Makes sense.
> There is a difference between enable_mask and used_pmcs. The used_pmcs
> bitmask shows all the config registers in use, whereas enable_mask
> shows all the config registers which have start/stop capabilities.
> For the basic AMD64 PMU (4 counters), used_pmcs and enable_mask are
> equivalent, but that is not the case on Barcelona once we support
> IBS and sampling. So for now, I could clean this up and drop
> enable_mask and use plain used_pmcs.
Understood. If we need that in the near future then it's OK to keep
it; it just did not make sense given the current code.
But I think you should do this once when you set up the context and
keep that as a separate mask. Right now you evaluate enable_mask and
used_pmcs over and over again.
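
Something along these lines would do it. A rough sketch only
(untested; the field and helper names are made up, and it uses a
plain u64 where the real code uses the pfm_arch_bv_* bitmap helpers):

#include <linux/types.h>
#include <linux/bitops.h>	/* hweight64() */

struct pfm_arch_context {
	u64 enable_mask;		/* pmcs with start/stop capability */
	unsigned int num_enabled;	/* cached weight of enable_mask */
};

/* Do this once at context setup instead of on every start/stop. */
static void pfm_arch_init_enable_mask(struct pfm_arch_context *ctx_arch,
				      u64 used_pmcs, u64 startstop_pmcs)
{
	ctx_arch->enable_mask = used_pmcs & startstop_pmcs;
	ctx_arch->num_enabled = hweight64(ctx_arch->enable_mask);
}

The start/stop and context switch paths then just read the cached
values instead of recomputing mask and weight every time.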
> >> + count = pfm_arch_bv_weight(used_mask, max_enable);
> >
> > So we have:
> >
> > set->used_pmcs and enable_mask and max_enable.
> >
> > Why can set->used_pmcs contain bits which are not in the enable_mask
> > in the first place? Why does the arch code not tell the generic code
> > which pmcs are available, so we can avoid all this mask/weight
> > magic?
> >
>
> Because used_pmcs is part of the generic code and enable_mask is an
> x86 construct. As I said above, for now, I could drop enable_mask.
> The arch code already exports the list of available pmcs and pmds in
> impl_pmcs and impl_pmds.
See above.
> > Why are the counters enabled at all when an overflow is pending,
> > which stopped the counters anyway?
> >
> Because on Intel and AMD64, counters are not automatically frozen on
> interrupt. On Intel x86, they can be configured to do so, but it is
> an all-or-nothing setting. I am not using this option because we
> would then have a problem with the NMI watchdog, given that it is
> also using a counter.
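
For reference, the all-or-nothing configuration mentioned above is
the freeze-on-PMI bit in IA32_DEBUGCTL (bit 12 per the Intel SDM).
A minimal sketch, not taken from the patch:

#include <asm/msr.h>	/* rdmsrl()/wrmsrl() */

#define MSR_IA32_DEBUGCTLMSR		0x000001d9
#define DEBUGCTL_FREEZE_PERFMON_ON_PMI	(1ULL << 12)	/* per the Intel SDM */

static void pmu_set_freeze_on_pmi(void)
{
	u64 debugctl;

	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
	/*
	 * Freezes *all* counters on a PMI, including the one the NMI
	 * watchdog uses, hence the conflict described above.
	 */
	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl | DEBUGCTL_FREEZE_PERFMON_ON_PMI);
}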
Well, my question was: why do we have to stop the counters when an
overflow is pending already ?
The overflow-pending flag is set inside stop_save() and cleared
somewhere else.
stop_save() is called from pfm_arch_stop() and
pfm_arch_ctxswout_thread(). The first thing it does is to disable the
counters.
Now at some point the counters are obviously reenabled for this
context, but why are they reenabled _before_ the pending overflow has
been resolved? For N counters that's N * 2 wrmsrl() of overhead.
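
Schematically (hypothetical helpers, not the code from the patch),
the cost in question is:

#include <asm/msr.h>	/* wrmsrl() */

#define PMU_EVNTSEL_BASE	0x186	/* IA32_PERFEVTSELx base, architectural PMU */

static void pmu_disable_counters(u64 enable_mask, unsigned int max_enable)
{
	unsigned int i;

	for (i = 0; i < max_enable; i++)
		if (enable_mask & (1ULL << i))
			wrmsrl(PMU_EVNTSEL_BASE + i, 0);	/* wrmsrl #1 per counter */
}

static void pmu_enable_counters(u64 enable_mask, const u64 *pmc_values,
				unsigned int max_enable)
{
	unsigned int i;

	for (i = 0; i < max_enable; i++)
		if (enable_mask & (1ULL << i))
			wrmsrl(PMU_EVNTSEL_BASE + i, pmc_values[i]);	/* wrmsrl #2 */
}

Reenabling before the pending overflow is handled means the enable
loop runs only to be undone again right away: N * 2 wasted wrmsrl()s
for N counters.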
Thanks,
tglx