lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <50CEA2F7.7020908@gmail.com>
Date:	Sun, 16 Dec 2012 21:43:35 -0700
From:	David Ahern <dsahern@...il.com>
To:	Ingo Molnar <mingo@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...radead.org>
CC:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [GIT PULL] perf changes for v3.8

On 12/13/12 10:31 AM, Ingo Molnar wrote:
> * Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>> So the default shouldn't necessarily be "include guest". The default
>> should presumably be "the user didn't say", and then the kernel does
>> whatever works best.
>>
>> If the user actually explicitly says one or the other, we should try
>> to honor that (and then EOPNOTSUPP may be a "sorry, I really cannot do
>> that particular combination that you explicitly asked for").
>>
>> That should make everybody happy. Doing a non-PEBS virtualized perf
>> run should still work with the old binary.
>>
>> So there should be two bits: "include guest" (V in the event specifier
>> unless you already used that for something else) and "host only" (H),
>> and they should both default to off. Then the kernel can see the three
>> actual cases.
>>
>> (Or four cases, if you really want to: you may or may not want to make
>> the "both V and H set means both, and _only_ V set means 'no host at
>> all, _only_ virtual environment'. So then ":ppV" would mean
>> "cycle-accurate for virtual box _only_", while ":ppVH" would mean
>> "cycle-accurate for both the host and the virtual box". Of course,
>> considering the PEBS interface, right now neither of those can
>> actually work, but plain ":V" and ":HV" could work).
>>
>> The important thing, I think, is that if the user doesn't know
>> or care about the VM case (because he's not running any!) and
>> doesn't specify, then the kernel should not say EOPNOTSUPP,
>> and should do whatever works for that cpu.
>
> Agreed.
>
> David, wanna send a patch for this?

As I mentioned in a prior email exclude_{guest,host} work currently work 
fine without PEBS. The current matrix for the flags:

                    profiling
                   guest    host
-e <event>         y        y
-e <event>:G       y        n    - G means enable guest, turn off host
-e <event>:H       n        y    - H means enable host, turn off guest
-e <event>:GH      y        y    - G followed by H means enable both
-e <event>:HG      y        y    - same as GH

There is no reason to change how these work. It's the variants with :p 
that need to be handled:

-e <event>:p       n        y    - guest off is required
-e <event>:pG      y        n    - needs to fail - not supported
-e <event>:pH      n        y
-e <event>:pGH     y        y    - needs to fail - not supported

This is the logic that was implemented in the original patchset which 
was pulled into v3.7 and the cause of this email thread.

One suggestion was to switch exclude_guest to include_guest. I take that 
to mean deprecate the current exclude_guest and add a new include_guest 
flag. Given that there are a number of exclude_XXXX flags (XXXX = user, 
kernel, host, guest, hv, etc) that would make the perf code inconsistent.

All that is needed is for the current exclude_guest flag to be 
deprecated such that for older binaries on newer kernels it is ignored 
(perhaps a warn on once), and then a new flag -- exclude_guest2 -- is 
then used for the new logic.

e.g.,

diff --git a/include/uapi/linux/perf_event.h 
b/include/uapi/linux/perf_event.h
index 4f63c05..19900df 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -266,12 +266,14 @@ struct perf_event_attr {
                 sample_id_all  :  1, /* sample_type all events */

                 exclude_host   :  1, /* don't count in host   */
-               exclude_guest  :  1, /* don't count in guest  */
+               exclude_guest  :  1, /* don't count in guest - DEPRECATED */

                 exclude_callchain_kernel : 1, /* exclude kernel 
callchains */
                 exclude_callchain_user   : 1, /* exclude user callchains */

-               __reserved_1   : 41;
+               exclude_guest2  :  1, /* don't count in guest  */
+
+               __reserved_1   : 40;

     union {
         __u32       wakeup_events;    /* wakeup every n events */


Do you agree with that?

David
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ