lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <af15ce5e-2344-4537-aace-ea3615d9607e@linaro.org>
Date: Thu, 29 Aug 2024 14:35:28 +0100
From: James Clark <james.clark@...aro.org>
To: Mike Leach <mike.leach@...aro.org>,
 Ganapatrao Kulkarni <gankulkarni@...amperecomputing.com>
Cc: Leo Yan <leo.yan@....com>, scclevenger@...amperecomputing.com,
 acme@...hat.com, coresight@...ts.linaro.org,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 darren@...amperecomputing.com, james.clark@....com, suzuki.poulose@....com,
 Al.Grant@....com
Subject: Re: [PATCH] perf scripts python arm-cs-trace-disasm.py: Skip disasm
 if address continuity is broken



On 28/08/2024 10:33 am, Mike Leach wrote:
> Hi James,
> 
> On Fri, 23 Aug 2024 at 10:03, James Clark <james.clark@...aro.org> wrote:
>>
>>
>>
>> On 19/08/2024 11:59 am, Mike Leach wrote:
>>> Hi,
>>>
>>> A new branch of OpenCSD is available - ocsd-consistency-checks-1.5.4-rc1
>>>
>>> Testing I managed to do confirms the N atom on unconditional branches
>>> appear to work. I do not have a test case for the range
>>> discontinuities.
>>>
>>> The checks are enabled using operation flags on decoder creation. See
>>> the docs for details.
>>>
>>> Mike
>>>
>>
>> Hi Mike,
>>
>> I tested the new OpenCSD and I don't see the error anymore in the
>> disassembly script. I'm not sure if we need to go any further and add
>> the backwards check, it looks like just a later symptom and the checks
>> that you've added already prevent it.
>>
> 
> The OCSD_OPFLG_CHK_RANGE_CONTINUE is the backwards address check - at
> least as so far as is possible in OpenCSD.
> What it checks is if the next range after a not taken branch starts at
> the end address of the previous range. However this check is cancelled
> if there are other packets that intervene - e.g. trace on / exceptions
> / anything that might imply a discontinuity.
> 
> The other caveat is that I did not have an example to see if the code
> could actually get triggered - though I will go back and manually
> trigger it in the debugger just to test functional correctness.
> 
> If you are still seeing backwards addresses after these changes then I
> am not sure where they are coming from. It may be there is a missing
> discontinuity somewhere that is not being flagged.

I tracked down this issue, so there are two issues now:

   #1 Using vmlinux which is a bad image, but is
      fixed by your OpenCSD bad image detection changes.

   #2 With Ganapat's kcore which should be the correct image, the issue
      is in Perf. There is a bug in the handling of a full packet_queue
      resulting in it setting the previous branch destination rather than
      the next one for the last sample in the queue.

I should be able to send a patch for this. I'll also try to add a test 
because this decode script seems like a good place to catch bugs.


> 
> The other alternative that does occur to me now - thinking about
> incorrect images, is if we incorrectly associate an atom with a direct
> branch rather than an indirect branch.
> For a direct branch the decoder will calculate the target and carry on
> - not looking for an address update as it is not needed. Then when the
> address update does arrive, it is used as a latest address and the
> range address will be updated.

Yeah this is the backwards address issue I was thinking of, but with the 
other OpenCSD changes I don't think there is an example of it, so we can 
probably hold off for now. Especially if it's difficult to test for.

> 
> Unfortunately this would be difficult to test for - the decoder is
> written to assume good trace and correct images - adding in code to
> try to remember previous state and judge if something is wrong,
> without getting false positives is difficult. It adds code complexity
> that is not necessary for well behaved clients!
> 
>> If you release a new version I can send the perf patch. I was going to
>> use these flags if that looks right to you? As far as I know that's the
>> set that can be always on and won't fail on bad hardware?
>>
> 
> That set of flags is fine -
> 
>> I also assumed that ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK can be given even
>> for etmv3 and it's just a nop?
>>
> 
> It is safe - as there is no flag for ETMv3 in the same slot.
> Effectively decode flags have a common set of bits and decoder
> specific set of bits that overlap for each decoder.  We have not
> really needed anything for ETMv3 to date, and I don't expect that to
> change.
> 
> 
> I'll get the new version released by the end of the week. I am off on
> sabbatical for a month after that so any further investigation /
> changes will have to wait
> 
> Regards
> 
> Mike
> 
> 
>> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> index e917985bbbe6..90967fd807e6 100644
>> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> @@ -685,9 +685,14 @@ cs_etm_decoder__create_etm_decoder(struct cs_etm_decoder_params *d_params,
>>                   return 0;
>>
>>           if (d_params->operation == CS_ETM_OPERATION_DECODE) {
>> +               int decode_flags = OCSD_CREATE_FLG_FULL_DECODER;
>> +#ifdef OCSD_OPFLG_N_UNCOND_DIR_BR_CHK
>> +               decode_flags |= OCSD_OPFLG_N_UNCOND_DIR_BR_CHK | OCSD_OPFLG_CHK_RANGE_CONTINUE |
>> +                               ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK;
>> +#endif
>>                   if (ocsd_dt_create_decoder(decoder->dcd_tree,
>>                                              decoder->decoder_name,
>> -                                          OCSD_CREATE_FLG_FULL_DECODER,
>> +                                          decode_flags,
>>                                              trace_config, &csid))
>>                           return -1;
>>
>>> On Fri, 9 Aug 2024 at 16:20, James Clark <james.clark@...aro.org> wrote:
>>>>
>>>>
>>>>
>>>> On 09/08/2024 3:13 pm, Mike Leach wrote:
>>>>> Hi James
>>>>>
>>>>> On Thu, 8 Aug 2024 at 10:32, James Clark <james.clark@...aro.org> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 07/08/2024 5:48 pm, Leo Yan wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> On 8/7/2024 3:53 PM, James Clark wrote:
>>>>>>>
>>>>>>> A minor suggestion: if the discussion is too long, please delete the
>>>>>>> irrelevant message ;)
>>>>>>>
>>>>>>> [...]
>>>>>>>
>>>>>>>>> --- a/tools/perf/scripts/python/arm-cs-trace-disasm.py
>>>>>>>>> +++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py
>>>>>>>>> @@ -257,6 +257,11 @@ def process_event(param_dict):
>>>>>>>>>                      print("Stop address 0x%x is out of range [ 0x%x .. 0x%x
>>>>>>>>> ] for dso %s" % (stop_addr, int(dso_start), int(dso_end), dso))
>>>>>>>>>                      return
>>>>>>>>>
>>>>>>>>> +       if (stop_addr < start_addr):
>>>>>>>>> +               if (options.verbose == True):
>>>>>>>>> +                       print("Packet Dropped, Discontinuity detected
>>>>>>>>> [stop_add:0x%x start_addr:0x%x ] for dso %s" % (stop_addr, start_addr,
>>>>>>>>> dso))
>>>>>>>>> +               return
>>>>>>>>> +
>>>>>>>>
>>>>>>>> I suppose my only concern with this is that it hides real errors and
>>>>>>>> Perf shouldn't be outputting samples that go backwards. Considering that
>>>>>>>> fixing this in OpenCSD and Perf has a much wider benefit I think that
>>>>>>>> should be the ultimate goal. I'm putting this on my todo list for now
>>>>>>>> (including Steve's merging idea).
>>>>>>>
>>>>>>> In the perf's util/cs-etm.c file, it handles DISCONTINUITY with:
>>>>>>>
>>>>>>>        case CS_ETM_DISCONTINUITY:
>>>>>>>             /*
>>>>>>>              * The trace is discontinuous, if the previous packet is
>>>>>>>              * instruction packet, set flag PERF_IP_FLAG_TRACE_END
>>>>>>>              * for previous packet.
>>>>>>>              */
>>>>>>>             if (prev_packet->sample_type == CS_ETM_RANGE)
>>>>>>>                     prev_packet->flags |= PERF_IP_FLAG_BRANCH |
>>>>>>>                                           PERF_IP_FLAG_TRACE_END;
>>>>>>>
>>>>>>> I am wandering if OpenCSD has passed the correct info so Perf decoder can
>>>>>>> detect the discontinuity. If yes, then the flag 'PERF_IP_FLAG_TRACE_END' will
>>>>>>> be set (it is a general flag in branch sample), then we can consider use it in
>>>>>>> the python script to handle discontinuous data.
>>>>>>
>>>>>> No OpenCSD isn't passing the correct info here. Higher up in the thread
>>>>>> I suggested an OpenCSD patch that makes it detect the error earlier and
>>>>>> fixes the issue. It also needs to output a discontinuity when the
>>>>>> address goes backwards. So two fixes and then the script works without
>>>>>> modifications.
>>>>>>
>>>>>
>>>>> Which address is going backwards here? - OpenCSD generates trace
>>>>> ranges only by walking forwards from the last known address till it
>>>>> hits a branch. Unless this wraps round 0x000000 this will never result
>>>>> in a backwards address as far as I can see.
>>>>> Do you have an example dump with OpenCSD outputting a range packet
>>>>> with backwards addresses?
>>>>>
>>>>> Mike
>>>>>
>>>> The example I have I think is something like this:
>>>>
>>>> 1.  Start address / trace on
>>>> 2.  E
>>>> 3.  Output range
>>>>        ...
>>>> 4.  Periodic address update
>>>>        ...
>>>> 5.  E
>>>> 6.  Output range
>>>>
>>>> If decode has gone wrong (but undetectably) between steps 1 and 3. Then
>>>> the next steps still output a second range based on the last periodic
>>>> address received. (I think it might not necessarily have to be a
>>>> periodic address but could also be indirect address packet?). Perf
>>>> converts the ranges into branch samples by taking the end of the first
>>>> range and beginning of the second range. Then the disassembly script
>>>> converts those samples into ranges again by taking the source and
>>>> destination of the last two branch samples.
>>>>
>>>> The original issue that Ganapat saw was that the periodic address causes
>>>> OpenCSD to put the source address of the second range somewhere before
>>>> the first one, even though it didn't output a branch or discontinuity
>>>> that would explain how it got there.
>>>>
>>>> But yes you're right the ranges themselves always go forwards from the
>>>> point of view of their own start and end addresses.
>>>>
>>>> I thought it might be possible for OpenCSD to check against the last
>>>> range output? Although I wasn't sure if maybe it's actually valid to do
>>>> a backwards jump like that without the trace on/off packets with address
>>>> filtering or something?
>>>>
>>>> The root cause is still the incorrect image, but I think this check
>>>> along with the other direct branch check should make it pretty difficult
>>>> for people to make the mistake.
>>>
>>>
>>>
> 
> 
> 
> --
> Mike Leach
> Principal Engineer, ARM Ltd.
> Manchester Design Centre. UK

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ