[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <07c9b3be-676a-f7f8-d1d2-94ce3abcb61b@arm.com>
Date: Thu, 30 Nov 2023 10:07:15 +0000
From: James Clark <james.clark@....com>
To: Peter Zijlstra <peterz@...radead.org>,
Adrian Hunter <adrian.hunter@...el.com>
Cc: Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Heiko Carstens <hca@...ux.ibm.com>,
Thomas Richter <tmricht@...ux.ibm.com>,
Hendrik Brueckner <brueckner@...ux.ibm.com>,
Suzuki K Poulose <suzuki.poulose@....com>,
Mike Leach <mike.leach@...aro.org>, coresight@...ts.linaro.org,
linux-arm-kernel@...ts.infradead.org,
Yicong Yang <yangyicong@...ilicon.com>,
Jonathan Cameron <jonathan.cameron@...wei.com>,
Will Deacon <will@...nel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org
Subject: Re: [PATCH RFC 2/3] perf/x86/intel/pt: Add support for pause_resume()
On 29/11/2023 12:23, Peter Zijlstra wrote:
> On Wed, Nov 29, 2023 at 01:15:43PM +0200, Adrian Hunter wrote:
>> On 29/11/23 12:58, Peter Zijlstra wrote:
>>> On Wed, Nov 29, 2023 at 09:53:39AM +0000, James Clark wrote:
>>>> On 23/11/2023 12:18, Adrian Hunter wrote:
>>>
>>>>> +static void pt_event_pause_resume(struct perf_event *event)
>>>>> +{
>>>>> + if (event->aux_paused)
>>>>> + pt_config_stop(event);
>>>>> + else if (!event->hw.state)
>>>>> + pt_config_start(event);
>>>>> +}
>>>>
>>>> It seems like having a single pause/resume callback rather than separate
>>>> pause and resume ones pushes some of the event state management into the
>>>> individual drivers and would be prone to code duplication and divergent
>>>> behavior.
>>>>
>>>> Would it be possible to move the conditions from here into the core code
>>>> and call separate functions instead?
>>>>
>>>>> +
>>>>> static void pt_event_start(struct perf_event *event, int mode)
>>>>> {
>>>>> struct hw_perf_event *hwc = &event->hw;
>>>>> @@ -1798,6 +1809,7 @@ static __init int pt_init(void)
>>>>> pt_pmu.pmu.del = pt_event_del;
>>>>> pt_pmu.pmu.start = pt_event_start;
>>>>> pt_pmu.pmu.stop = pt_event_stop;
>>>>> + pt_pmu.pmu.pause_resume = pt_event_pause_resume;
>>>>
>>>> The general idea seems ok to me. Is there a reason to not use the
>>>> existing start() stop() callbacks, rather than adding a new one?
>>>>
>>>> I assume it's intended to be something like an optimisation where you
>>>> can turn it on and off without having to do the full setup, teardown and
>>>> emit an AUX record because you know the process being traced never gets
>>>> switched out?
>>>
>>> So the actual scheduling uses ->add() / ->del(), the ->start() /
>>> ->stop() methods are something that can be used after ->add() and before
>>> ->del() to 'temporarily' pause things.
>>>
>>> Pretty much exactly what is required here I think. We currently use this
>>> for PMI throttling and adaptive frequency stuff, but there is no reason
>>> it could not also be used for this.
>>>
>>> As is, we don't track the paused state across ->del() / ->add(), but
>>> perhaps that can be fixed. We can easily add more PERF_EF_ / PERF_HES_
>>> bits to manage things.
>>>
>>>
>>
>> I am not sure stop / start play nice with NMI's from other events e.g.
>>
>> PMC NMI wants to pause or resume AUX but what if AUX event is currently
>> being processed in ->stop() or ->start()? Or maybe that can't happen?
>
> I think that can happen, and pt_event_stop() can actually handle some of
> that, while your pause_resume() thing, which uses pt_config_stop() does
> not.
>
> But yes, I think that if you add pt_event_{stop,start}() calls from
> *other* events their PMI, then you get to deal with more 'fun'.
>
> Something like:
>
> perf_addr_filters_adjust()
> __perf_addr_filters_adjust()
> perf_event_stop()
> __perf_event_stop()
> event->pmu->stop()
> <NMI>
> ...
> perf_event_overflow()
> pt_event->pmu->stop()
> </NMI>
> event->pmu->start() // whoopsie!
>
> Should now be possible.
>
> I think what you want to do is rename pt->handle_nmi into pt->stop_count
> and make it a counter, then ->stop() increments it, and ->start()
> decrements it and everybody ensures the thing doesn't get restart while
> !0 etc..
>
> I suspect you need to guard the generic part of this feature with a new
> PERF_PMU_CAP_ flag and then have the coresight/etc. people opt-in once
> they've audited things.
>
> James, does that work for you?
>
Yep I think that would work.
If I understand it with the stop_count counter, to be able to support
the new CAP, every device would basically have to implement a similar
counter?
Would it be possible to do that reference counting on the outside of
start() and stop() in the core code? So that a device only ever sees one
call to start and one call to stop and doesn't need to do any extra
tracking?
Powered by blists - more mailing lists