lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f436dc23-bcaf-493b-93a9-699b63a18d46@linaro.org>
Date: Wed, 25 Sep 2024 09:36:15 +0100
From: James Clark <james.clark@...aro.org>
To: Namhyung Kim <namhyung@...nel.org>
Cc: "Liang, Kan" <kan.liang@...ux.intel.com>,
 Arnaldo Carvalho de Melo <acme@...nel.org>, Ian Rogers <irogers@...gle.com>,
 Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
 Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>,
 LKML <linux-kernel@...r.kernel.org>, linux-perf-users@...r.kernel.org,
 Ravi Bangoria <ravi.bangoria@....com>, Mark Rutland <mark.rutland@....com>,
 James Clark <james.clark@....com>, Kajol Jain <kjain@...ux.ibm.com>,
 Thomas Richter <tmricht@...ux.ibm.com>, Atish Patra <atishp@...shpatra.org>,
 Palmer Dabbelt <palmer@...osinc.com>, Mingwei Zhang <mizhang@...gle.com>
Subject: Re: [PATCH 04/10] perf stat: Add --exclude-guest option



On 24/09/2024 9:21 pm, Namhyung Kim wrote:
> Hello,
> 
> On Mon, Sep 23, 2024 at 09:47:17AM +0100, James Clark wrote:
>>
>>
>> On 06/09/2024 3:33 pm, Liang, Kan wrote:
>>>
>>>
>>> On 2024-09-05 4:24 p.m., Namhyung Kim wrote:
>>>> This option is to support the old behavior of setting exclude_guest by
>>>> default.  Now it doesn't set the bit so users want the old behavior can
>>>> use this option.
>>>>
>>>>     $ perf stat true
>>>>
>>>>      Performance counter stats for 'true':
>>>>
>>>>                   0.86 msec task-clock:u                     #    0.443 CPUs utilized
>>>>                      0      context-switches:u               #    0.000 /sec
>>>>                      0      cpu-migrations:u                 #    0.000 /sec
>>>>                     49      page-faults:u                    #   56.889 K/sec
>>>>                    ...
>>>>
>>>>     $ perf stat --exclude-guest true
>>>>
>>>>      Performance counter stats for 'true':
>>>>
>>>>                   0.79 msec task-clock:Hu                    #    0.490 CPUs utilized
>>>>                      0      context-switches:Hu              #    0.000 /sec
>>>>                      0      cpu-migrations:Hu                #    0.000 /sec
>>>>                     49      page-faults:Hu                   #   62.078 K/sec
>>>>                    ...
>>>>
>>>> Signed-off-by: Namhyung Kim <namhyung@...nel.org>
>>>> ---
>>>>    tools/perf/Documentation/perf-stat.txt | 7 +++++++
>>>>    tools/perf/builtin-stat.c              | 2 ++
>>>>    2 files changed, 9 insertions(+)
>>>>
>>>> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
>>>> index 2bc06367248691dd..d28d8370a856598f 100644
>>>> --- a/tools/perf/Documentation/perf-stat.txt
>>>> +++ b/tools/perf/Documentation/perf-stat.txt
>>>> @@ -382,6 +382,13 @@ color the metric's computed value.
>>>>    Don't print output, warnings or messages. This is useful with perf stat
>>>>    record below to only write data to the perf.data file.
>>>> +--exclude-guest::
>>>> +Don't count event in the guest mode.  It was the old behavior but the
>>>> +default is changed to count guest events also.  Use this option if you
>>>> +want the old behavior (host only).  Note that this option needs to be
>>>> +before other events in case you added -e/--event option in the command
>>>> +line.
>>>
>>> I'm not sure if we really need this option. I think it may bring more
>>> trouble than what we get.
>>>
>>> The name of the "--exclude-guest" sounds like a replacement of the event
>>> modifier "H". But in fact, it's not. It should only affect the default.
>>> It doesn't set the "H" for any events.
> 
> Well I think it's tricky but it'd set "H" modifier events after the
> option.  But I have to agree that it can bring more troubles.
> 
>>>
>>> Except for the perf kvm user, I don't think there are many users which
>>> care the exclude_guest. The behavior of the perf kvm is not changed. So
>>> the option seems not that important. If we really want an option to
>>> restore the old behavior, it's better to choose a better name and update
>>> the description.
> 
> Personally I don't want to this option but just worried if there's a
> case where exclude_guest is preferred.
> 
>>
>> Do we not want to keep exclude_guest for record, but remove it for stat?
> 
> What I really want is to synchronize record and stat behavior in terms
> of the default exclusion mode.  If everyone is fine, I'd like to remove
> exclude_guest from the default and set it only if needed (through the
> fallback) like we do for exclude_kernel.
> 
>>
>> Because in record the addresses of guest samples don't make sense without
>> extra work, but for stat you might want to see an overview of the whole
>> system.
> 
> I think it depends on the use case and (power) users should know about
> their environment and requirement.  The concern is what's the reasonable
> default, but I think it should be the same both in perf record and stat
> at least.
> 
>>
>> For Coresight tracing and SPE we would want to keep exclude_guest, otherwise
>> you generate a load of extra trace that you can't make use of. Say you were
>> doing PGO on your host you wouldn't be recompiling anything the guests were
>> running.
> 
> For the specific use case, I think we can guide users to add "H"
> modifier (I guess it's not the default event for perf record).
> Maybe we can consider per-PMU default attributes.
> 

Yeah I was thinking about adding it as a default when the Coresight and 
SPE events are configured, but I think it's too much to expect for 
people to know all the edge cases about what the per PMU defaults will 
be. Having one default for all PMUs makes more sense to me.

I suppose asking Coresight users to add :H when they need it might be a 
less bad thing than this change to clean up the fragmentation in the 
defaults.

I expect most of the time if :H is not added things continue to work, 
just that more data will be recorded. One real concern is that in some 
configurations there is a limited buffer size and once it's filled you 
don't get any more data. If that buffer is filled by guest trace then 
maybe some Auto FDO flow could break. But it's kind of an edge case of 
an edge case and adding :H isn't really the end of the world.

>>
>> If we do change the defaults isn't ':H' already enough to go back to the old
>> behavior? I'm wondering why we need an argument when all the other exclude
>> rules are done with the letter modifiers?
> 
> I'm not sure I follow this.  But maybe we don't need this option at all.
> We can add ":H" for every event but I'm too lazy to add them to all the
> default events in perf data. :)
> 
> Thanks,
> Namhyung
>   

Oh I see yes, the argument is a shortcut to adding H on all events 
and/or the default ones. I wasn't sure if was added because it couldn't 
be done with :H, but I see it's an alternative instead. So adding it 
makes sense.

>>>> +
>>>>    STAT RECORD
>>>>    -----------
>>>>    Stores stat data into perf data file.
>>>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>>>> index d8315dae930184ba..4d47675af5cc3094 100644
>>>> --- a/tools/perf/builtin-stat.c
>>>> +++ b/tools/perf/builtin-stat.c
>>>> @@ -2491,6 +2491,8 @@ int cmd_stat(int argc, const char **argv)
>>>>    		OPT_BOOLEAN_FLAG(0, "all-user", &stat_config.all_user,
>>>>    				"Configure all used events to run in user space.",
>>>>    				PARSE_OPT_EXCLUSIVE),
>>>> +		OPT_BOOLEAN(0, "exclude-guest", &exclude_HG_default,
>>>> +			"Don't count events in the guest mode"),

Maybe this would be a good hint if it really is equivalent:

  +	"Don't count events in the guest mode. (Add :H to all events)"),

>>>>    		OPT_BOOLEAN(0, "percore-show-thread", &stat_config.percore_show_thread,
>>>>    			"Use with 'percore' event qualifier to show the event "
>>>>    			"counts of one hardware thread by sum up total hardware "
>>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ