[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <50D6086E.4000402@gmail.com>
Date: Sat, 22 Dec 2012 12:22:22 -0700
From: David Ahern <dsahern@...il.com>
To: Ingo Molnar <mingo@...nel.org>,
Arnaldo Carvalho de Melo <acme@...radead.org>
CC: Linus Torvalds <torvalds@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [GIT PULL] perf changes for v3.8
any opinions on whether the approach is reasonable?
On 12/16/12 9:43 PM, David Ahern wrote:
> On 12/13/12 10:31 AM, Ingo Molnar wrote:
>> * Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>>> So the default shouldn't necessarily be "include guest". The default
>>> should presumably be "the user didn't say", and then the kernel does
>>> whatever works best.
>>>
>>> If the user actually explicitly says one or the other, we should try
>>> to honor that (and then EOPNOTSUPP may be a "sorry, I really cannot do
>>> that particular combination that you explicitly asked for").
>>>
>>> That should make everybody happy. Doing a non-PEBS virtualized perf
>>> run should still work with the old binary.
>>>
>>> So there should be two bits: "include guest" (V in the event specifier
>>> unless you already used that for something else) and "host only" (H),
>>> and they should both default to off. Then the kernel can see the three
>>> actual cases.
>>>
>>> (Or four cases, if you really want to: you may or may not want to make
>>> the "both V and H set means both, and _only_ V set means 'no host at
>>> all, _only_ virtual environment'. So then ":ppV" would mean
>>> "cycle-accurate for virtual box _only_", while ":ppVH" would mean
>>> "cycle-accurate for both the host and the virtual box". Of course,
>>> considering the PEBS interface, right now neither of those can
>>> actually work, but plain ":V" and ":HV" could work).
>>>
>>> The important thing, I think, is that if the user doesn't know
>>> or care about the VM case (because he's not running any!) and
>>> doesn't specify, then the kernel should not say EOPNOTSUPP,
>>> and should do whatever works for that cpu.
>>
>> Agreed.
>>
>> David, wanna send a patch for this?
>
> As I mentioned in a prior email exclude_{guest,host} work currently work
> fine without PEBS. The current matrix for the flags:
>
> profiling
> guest host
> -e <event> y y
> -e <event>:G y n - G means enable guest, turn off host
> -e <event>:H n y - H means enable host, turn off guest
> -e <event>:GH y y - G followed by H means enable both
> -e <event>:HG y y - same as GH
>
> There is no reason to change how these work. It's the variants with :p
> that need to be handled:
>
> -e <event>:p n y - guest off is required
> -e <event>:pG y n - needs to fail - not supported
> -e <event>:pH n y
> -e <event>:pGH y y - needs to fail - not supported
>
> This is the logic that was implemented in the original patchset which
> was pulled into v3.7 and the cause of this email thread.
>
> One suggestion was to switch exclude_guest to include_guest. I take that
> to mean deprecate the current exclude_guest and add a new include_guest
> flag. Given that there are a number of exclude_XXXX flags (XXXX = user,
> kernel, host, guest, hv, etc) that would make the perf code inconsistent.
>
> All that is needed is for the current exclude_guest flag to be
> deprecated such that for older binaries on newer kernels it is ignored
> (perhaps a warn on once), and then a new flag -- exclude_guest2 -- is
> then used for the new logic.
>
> e.g.,
>
> diff --git a/include/uapi/linux/perf_event.h
> b/include/uapi/linux/perf_event.h
> index 4f63c05..19900df 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -266,12 +266,14 @@ struct perf_event_attr {
> sample_id_all : 1, /* sample_type all events */
>
> exclude_host : 1, /* don't count in host */
> - exclude_guest : 1, /* don't count in guest */
> + exclude_guest : 1, /* don't count in guest -
> DEPRECATED */
>
> exclude_callchain_kernel : 1, /* exclude kernel
> callchains */
> exclude_callchain_user : 1, /* exclude user
> callchains */
>
> - __reserved_1 : 41;
> + exclude_guest2 : 1, /* don't count in guest */
> +
> + __reserved_1 : 40;
>
> union {
> __u32 wakeup_events; /* wakeup every n events */
>
>
> Do you agree with that?
>
> David
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists