[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJ9a7VjWKX-ugfM3VE6_ZfSooc7MHOFEWe=S_jAGHK+gyFUgfQ@mail.gmail.com>
Date: Wed, 28 Aug 2024 10:33:29 +0100
From: Mike Leach <mike.leach@...aro.org>
To: James Clark <james.clark@...aro.org>
Cc: Ganapatrao Kulkarni <gankulkarni@...amperecomputing.com>, Leo Yan <leo.yan@....com>,
scclevenger@...amperecomputing.com, acme@...hat.com,
coresight@...ts.linaro.org, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, darren@...amperecomputing.com,
james.clark@....com, suzuki.poulose@....com, Al.Grant@....com
Subject: Re: [PATCH] perf scripts python arm-cs-trace-disasm.py: Skip disasm
if address continuity is broken
Hi James,
On Fri, 23 Aug 2024 at 10:03, James Clark <james.clark@...aro.org> wrote:
>
>
>
> On 19/08/2024 11:59 am, Mike Leach wrote:
> > Hi,
> >
> > A new branch of OpenCSD is available - ocsd-consistency-checks-1.5.4-rc1
> >
> > Testing I managed to do confirms the N atom on unconditional branches
> > appear to work. I do not have a test case for the range
> > discontinuities.
> >
> > The checks are enabled using operation flags on decoder creation. See
> > the docs for details.
> >
> > Mike
> >
>
> Hi Mike,
>
> I tested the new OpenCSD and I don't see the error anymore in the
> disassembly script. I'm not sure if we need to go any further and add
> the backwards check, it looks like just a later symptom and the checks
> that you've added already prevent it.
>
The OCSD_OPFLG_CHK_RANGE_CONTINUE is the backwards address check - at
least as so far as is possible in OpenCSD.
What it checks is if the next range after a not taken branch starts at
the end address of the previous range. However this check is cancelled
if there are other packets that intervene - e.g. trace on / exceptions
/ anything that might imply a discontinuity.
The other caveat is that I did not have an example to see if the code
could actually get triggered - though I will go back and manually
trigger it in the debugger just to test functional correctness.
If you are still seeing backwards addresses after these changes then I
am not sure where they are coming from. It may be there is a missing
discontinuity somewhere that is not being flagged.
The other alternative that does occur to me now - thinking about
incorrect images, is if we incorrectly associate an atom with a direct
branch rather than an indirect branch.
For a direct branch the decoder will calculate the target and carry on
- not looking for an address update as it is not needed. Then when the
address update does arrive, it is used as a latest address and the
range address will be updated.
Unfortunately this would be difficult to test for - the decoder is
written to assume good trace and correct images - adding in code to
try to remember previous state and judge if something is wrong,
without getting false positives is difficult. It adds code complexity
that is not necessary for well behaved clients!
> If you release a new version I can send the perf patch. I was going to
> use these flags if that looks right to you? As far as I know that's the
> set that can be always on and won't fail on bad hardware?
>
That set of flags is fine -
> I also assumed that ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK can be given even
> for etmv3 and it's just a nop?
>
It is safe - as there is no flag for ETMv3 in the same slot.
Effectively decode flags have a common set of bits and decoder
specific set of bits that overlap for each decoder. We have not
really needed anything for ETMv3 to date, and I don't expect that to
change.
I'll get the new version released by the end of the week. I am off on
sabbatical for a month after that so any further investigation /
changes will have to wait
Regards
Mike
> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> index e917985bbbe6..90967fd807e6 100644
> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> @@ -685,9 +685,14 @@ cs_etm_decoder__create_etm_decoder(struct cs_etm_decoder_params *d_params,
> return 0;
>
> if (d_params->operation == CS_ETM_OPERATION_DECODE) {
> + int decode_flags = OCSD_CREATE_FLG_FULL_DECODER;
> +#ifdef OCSD_OPFLG_N_UNCOND_DIR_BR_CHK
> + decode_flags |= OCSD_OPFLG_N_UNCOND_DIR_BR_CHK | OCSD_OPFLG_CHK_RANGE_CONTINUE |
> + ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK;
> +#endif
> if (ocsd_dt_create_decoder(decoder->dcd_tree,
> decoder->decoder_name,
> - OCSD_CREATE_FLG_FULL_DECODER,
> + decode_flags,
> trace_config, &csid))
> return -1;
>
> > On Fri, 9 Aug 2024 at 16:20, James Clark <james.clark@...aro.org> wrote:
> >>
> >>
> >>
> >> On 09/08/2024 3:13 pm, Mike Leach wrote:
> >>> Hi James
> >>>
> >>> On Thu, 8 Aug 2024 at 10:32, James Clark <james.clark@...aro.org> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 07/08/2024 5:48 pm, Leo Yan wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> On 8/7/2024 3:53 PM, James Clark wrote:
> >>>>>
> >>>>> A minor suggestion: if the discussion is too long, please delete the
> >>>>> irrelevant message ;)
> >>>>>
> >>>>> [...]
> >>>>>
> >>>>>>> --- a/tools/perf/scripts/python/arm-cs-trace-disasm.py
> >>>>>>> +++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py
> >>>>>>> @@ -257,6 +257,11 @@ def process_event(param_dict):
> >>>>>>> print("Stop address 0x%x is out of range [ 0x%x .. 0x%x
> >>>>>>> ] for dso %s" % (stop_addr, int(dso_start), int(dso_end), dso))
> >>>>>>> return
> >>>>>>>
> >>>>>>> + if (stop_addr < start_addr):
> >>>>>>> + if (options.verbose == True):
> >>>>>>> + print("Packet Dropped, Discontinuity detected
> >>>>>>> [stop_add:0x%x start_addr:0x%x ] for dso %s" % (stop_addr, start_addr,
> >>>>>>> dso))
> >>>>>>> + return
> >>>>>>> +
> >>>>>>
> >>>>>> I suppose my only concern with this is that it hides real errors and
> >>>>>> Perf shouldn't be outputting samples that go backwards. Considering that
> >>>>>> fixing this in OpenCSD and Perf has a much wider benefit I think that
> >>>>>> should be the ultimate goal. I'm putting this on my todo list for now
> >>>>>> (including Steve's merging idea).
> >>>>>
> >>>>> In the perf's util/cs-etm.c file, it handles DISCONTINUITY with:
> >>>>>
> >>>>> case CS_ETM_DISCONTINUITY:
> >>>>> /*
> >>>>> * The trace is discontinuous, if the previous packet is
> >>>>> * instruction packet, set flag PERF_IP_FLAG_TRACE_END
> >>>>> * for previous packet.
> >>>>> */
> >>>>> if (prev_packet->sample_type == CS_ETM_RANGE)
> >>>>> prev_packet->flags |= PERF_IP_FLAG_BRANCH |
> >>>>> PERF_IP_FLAG_TRACE_END;
> >>>>>
> >>>>> I am wandering if OpenCSD has passed the correct info so Perf decoder can
> >>>>> detect the discontinuity. If yes, then the flag 'PERF_IP_FLAG_TRACE_END' will
> >>>>> be set (it is a general flag in branch sample), then we can consider use it in
> >>>>> the python script to handle discontinuous data.
> >>>>
> >>>> No OpenCSD isn't passing the correct info here. Higher up in the thread
> >>>> I suggested an OpenCSD patch that makes it detect the error earlier and
> >>>> fixes the issue. It also needs to output a discontinuity when the
> >>>> address goes backwards. So two fixes and then the script works without
> >>>> modifications.
> >>>>
> >>>
> >>> Which address is going backwards here? - OpenCSD generates trace
> >>> ranges only by walking forwards from the last known address till it
> >>> hits a branch. Unless this wraps round 0x000000 this will never result
> >>> in a backwards address as far as I can see.
> >>> Do you have an example dump with OpenCSD outputting a range packet
> >>> with backwards addresses?
> >>>
> >>> Mike
> >>>
> >> The example I have I think is something like this:
> >>
> >> 1. Start address / trace on
> >> 2. E
> >> 3. Output range
> >> ...
> >> 4. Periodic address update
> >> ...
> >> 5. E
> >> 6. Output range
> >>
> >> If decode has gone wrong (but undetectably) between steps 1 and 3. Then
> >> the next steps still output a second range based on the last periodic
> >> address received. (I think it might not necessarily have to be a
> >> periodic address but could also be indirect address packet?). Perf
> >> converts the ranges into branch samples by taking the end of the first
> >> range and beginning of the second range. Then the disassembly script
> >> converts those samples into ranges again by taking the source and
> >> destination of the last two branch samples.
> >>
> >> The original issue that Ganapat saw was that the periodic address causes
> >> OpenCSD to put the source address of the second range somewhere before
> >> the first one, even though it didn't output a branch or discontinuity
> >> that would explain how it got there.
> >>
> >> But yes you're right the ranges themselves always go forwards from the
> >> point of view of their own start and end addresses.
> >>
> >> I thought it might be possible for OpenCSD to check against the last
> >> range output? Although I wasn't sure if maybe it's actually valid to do
> >> a backwards jump like that without the trace on/off packets with address
> >> filtering or something?
> >>
> >> The root cause is still the incorrect image, but I think this check
> >> along with the other direct branch check should make it pretty difficult
> >> for people to make the mistake.
> >
> >
> >
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
Powered by blists - more mailing lists