linux-kernel - Re: [PATCH V6 11/14] perf/x86/intel: Disable sample-read the slots and metrics events


Message-ID: <b24b9bd3-bbfb-98d4-4df3-c263e002dcf5@linux.intel.com>
Date:   Tue, 21 Jul 2020 12:07:29 -0400
From:   "Liang, Kan" <kan.liang@...ux.intel.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     acme@...hat.com, mingo@...nel.org, linux-kernel@...r.kernel.org,
        jolsa@...nel.org, eranian@...gle.com,
        alexander.shishkin@...ux.intel.com, ak@...ux.intel.com
Subject: Re: [PATCH V6 11/14] perf/x86/intel: Disable sample-read the slots
 and metrics events



On 7/21/2020 9:10 AM, Peter Zijlstra wrote:
> On Fri, Jul 17, 2020 at 07:05:51AM -0700, kan.liang@...ux.intel.com wrote:
>> From: Kan Liang <kan.liang@...ux.intel.com>
>>
>> Users fail to sample-read the slots and metrics events, e.g.,
>> perf record -e '{slots, topdown-retiring}:S'.
>>
>> When reading the metrics event, the fixed counter 3 (slots) has to be
>> reset, which impacts the sampling of the slots event.
>>
>> Add a specific validate_group() support to reject the case and error out
>> for Ice Lake.
>>
>> An alternative fix may unconditionally disable slots sampling, but it's
>> not a decent fix. Users may want to only sample the slot events
>> without the topdown metrics events.
>>
>> Signed-off-by: Kan Liang <kan.liang@...ux.intel.com>
> 
> I'm confused by this; it doesn't make sense.
> 
> Should not patch 7 have something like the below instead?
> 
> Also, I think there is a bug when we create a group like this and then
> kill the leader, in that case the core code will 'promote' the sibling
> metric events to their own individual events, see perf_group_detach().

I'm trying to reproduce the bug mentioned above, but I'm not sure under 
what situation the core code will 'promote' the sibling metric events.

I tried the suggested code below. It works well for the sample-read 
case: the perf tool errors out as expected.

   perf record -e '{slots,topdown-fe-bound}:S' sleep 1
   Error:
   The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (topdown-fe-bound).
   /bin/dmesg | grep -i perf may provide additional information.

Thanks,
Kan

> 
> We need additional code to move those events into unrecoverable ERROR
> state. A new group_caps flag could indicate this promotion isn't
> allowed.
> 
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3441,8 +3441,22 @@ static int intel_pmu_hw_config(struct pe
>   	 * A flag PERF_X86_EVENT_TOPDOWN is applied for the case.
>   	 */
>   	if (x86_pmu.intel_cap.perf_metrics && is_topdown_event(event)) {
> -		if (is_metric_event(event) && is_sampling_event(event))
> -			return -EINVAL;
> +
> +		if (is_metric_event(event)) {
> +			struct perf_event *leader = event->group_leader;
> +
> +			if (is_sampling_event(event))
> +				return -EINVAL;
> +
> +			if (leader == event)
> +				return -EINVAL;
> +
> +			if (!is_slots_event(leader))
> +				return -EINVAL;
> +
> +			if (is_sampling_event(leader))
> +				return -EINVAL;
> +		}
>   
>   		if (!is_sampling_event(event)) {
>   			if (event->attr.config1 != 0)
>
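
The checks proposed in the quoted hunk can be read as four rules for a metric event: it must not sample, it must not be its own group leader, its leader must be the slots event, and that leader must not sample either. The following is a minimal user-space sketch of that rule set, not kernel code. The names struct toy_event and toy_validate_metric() are hypothetical, and the boolean fields only stand in for the kernel helpers is_metric_event(), is_slots_event() and is_sampling_event().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct perf_event. */
struct toy_event {
	bool is_metric;                 /* models is_metric_event() */
	bool is_slots;                  /* models is_slots_event() */
	bool is_sampling;               /* models is_sampling_event() */
	struct toy_event *group_leader; /* a leader points to itself */
};

/*
 * Mirrors the order of the checks in the quoted hunk.
 * Returns 0 if the event is acceptable, -EINVAL otherwise.
 */
int toy_validate_metric(const struct toy_event *event)
{
	const struct toy_event *leader = event->group_leader;

	/* Only metric events are restricted by these rules. */
	if (!event->is_metric)
		return 0;

	/* A metric event cannot be sampled. */
	if (event->is_sampling)
		return -EINVAL;

	/* A metric event cannot lead a group (or stand alone). */
	if (leader == event)
		return -EINVAL;

	/* The group leader must be the slots event. */
	if (!leader->is_slots)
		return -EINVAL;

	/* Sampling the slots leader is what makes sample-read fail. */
	if (leader->is_sampling)
		return -EINVAL;

	return 0;
}
```

Under this model, the '{slots,topdown-fe-bound}:S' group from the reply corresponds to a metric sibling whose slots leader has is_sampling set, so the last check rejects it, which is consistent with the EINVAL reported for topdown-fe-bound.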