linux-kernel - Re: [BUG] perf stat: explicit grouping yields unexpected results


Message-ID: <CABPqkBTVKBEU=vnGBK8MibSfVftRn++ftzLC4SB69O-LSaECtw@mail.gmail.com>
Date:	Fri, 29 Nov 2013 15:01:21 +0100
From:	Stephane Eranian <eranian@...gle.com>
To:	Jiri Olsa <jolsa@...hat.com>
Cc:	Andi Kleen <ak@...ux.intel.com>, Ingo Molnar <mingo@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	"mingo@...e.hu" <mingo@...e.hu>, David Ahern <dsahern@...il.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Namhyung Kim <namhyung.kim@....com>
Subject: Re: [BUG] perf stat: explicit grouping yields unexpected results

On Fri, Nov 29, 2013 at 2:52 PM, Jiri Olsa <jolsa@...hat.com> wrote:
> On Fri, Nov 29, 2013 at 02:43:35PM +0100, Stephane Eranian wrote:
>> On Fri, Nov 29, 2013 at 2:33 PM, Jiri Olsa <jolsa@...hat.com> wrote:
>> > On Sat, Nov 16, 2013 at 07:41:34PM -0800, Andi Kleen wrote:
>> >> > I'd say that the default behavior should be what Jiri implemented: get
>> >> > the most out of the situation and inform. But you are right in that
>> >> > 'forcing' all elements of a group to be valid should be possible as
>> >> > well - if a special perf stat option or event format is used.
>> >>
>> >> When something is multiplexed it can have a very
>> >> large measurement error. For workloads that fluctuate quite a bit, and the
>> >> fluctuations do not line up well with the multiplexing interval,
>> >> the default scaling does not give good results.
>> >>
>> >> So you expect to get good data, but you get very bad data.
>> >>
>> >> When collecting data for a large number of events it is important
>> >> to group them correctly, so that events that are directly dependent
>> >> on each other in equations are properly grouped.
>> >>
>> >> When explicit groups were added the user likely considered this
>> >> problem, so it's not good to silently override the choices.
>> >>
>> >> If a user doesn't care they can always not use groups.
>> >>
>> >> > Even in that second case it shouldn't say <unsupported> for everything
>> >> > in the result, but should deny the run immediately and return with an
>> >> > error, and should tell the user how many events in the group fit and
>> >> > which ones didn't.
>> >>
>> >> Returning this information would be great, but it would really
>> >> need an extended errno, or just an error string reported out.
>> >
>> > (sry for late reply, I was still ooo, and missed this conversation)
>> >
>> > I agree, when the last event fails sys_perf_event_open
>> > due to the validate_group check, we will get just EINVAL
>> >
>> > Was there any discussion about the error (or error string)
>> > propagation from sys_perf_event_open?
>> >
>> > Something like below? User space supplies a buffer for the error string.
>> >
>> No. Why do you need kernel changes for that?
>> Perf gets the error, knows it is grouping and prints an appropriate
>
> how does perf know it's grouping and not something else?
>
Because the group_fd on this syscall is not -1.

>> error message. Why do you need the kernel for this?
>
> like how would you differentiate EINVAL from validate_group or say
> from set_ext_hw_attr (got by using unsupported cache event) ?
>
If you cannot, simply abort and print something like:

  /* ret is the syscall return value; the error code itself is in errno */
  if (group_fd != -1 && ret < 0 && errno == EINVAL)
          warnx("cannot program event X in group. You may be overcommitting "
                "an event group, try reducing the number of events/group");

Or something like that.
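
To make the suggestion concrete, here is a minimal sketch of the user-space check as a standalone C helper. It is not code from the perf tool: the function name describe_open_failure, its parameters, and the message wording are hypothetical and chosen only for illustration. It assumes the caller has already seen perf_event_open() fail and passes in the group_fd it used together with the saved errno value.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of perf): build a diagnostic for a failed
 * perf_event_open() call.
 *
 *   event    - name of the event being opened, used only in the message
 *   group_fd - the group_fd argument that was passed to the syscall
 *   err      - the errno value saved right after the syscall failed
 *
 * A group_fd other than -1 means the event was being added to an existing
 * group, so EINVAL is reported as a probable grouping problem. This is a
 * guess, since EINVAL can also come from other checks in the kernel.
 */
static int describe_open_failure(char *buf, size_t len, const char *event,
                                 int group_fd, int err)
{
        if (group_fd != -1 && err == EINVAL)
                return snprintf(buf, len,
                                "cannot program event %s in group; the group "
                                "may be overcommitted, try reducing the "
                                "number of events per group", event);

        /* Any other case: fall back to the generic errno description. */
        return snprintf(buf, len, "cannot open event %s: %s",
                        event, strerror(err));
}
```

A caller would save errno immediately after the failing syscall and pass it in, for example describe_open_failure(buf, sizeof(buf), "cycles", group_fd, saved_errno). As noted earlier in the thread, this cannot tell a validate_group failure apart from other sources of EINVAL, so the message is worded as a hint rather than a diagnosis.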