lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 11 Jun 2021 14:55:35 +0100
From:   James Clark <james.clark@....com>
To:     Leo Yan <leo.yan@...aro.org>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mathieu Poirier <mathieu.poirier@...aro.org>,
        Suzuki K Poulose <suzuki.poulose@....com>,
        Mike Leach <mike.leach@...aro.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        John Garry <john.garry@...wei.com>,
        Will Deacon <will@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Daniel Kiss <daniel.kiss@....com>,
        Denis Nikitin <denik@...gle.com>, coresight@...ts.linaro.org,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        linux-perf-users@...r.kernel.org
Subject: Re: [PATCH v1 1/3] coresight: etm-perf: Correct buffer syncing for
 snapshot



On 10/06/2021 17:54, James Clark wrote:
> 
> 
> On 01/06/2021 13:35, Leo Yan wrote:
>> Hi James,
>>
>> On Tue, Jun 01, 2021 at 12:53:16PM +0300, James Clark wrote:
>>
>> [...]
>>
>>> Hi Leo,
>>>
>>> I was testing out snapshot mode (without your patch) and I noticed that it
>>> only ever collects from the last CPU. For example on a 4 core system,
>>> the CPU ID of the AUX records and the AUXTRACE buffers is always 3.
>>>
>>> This is with systemwide tracing, and running "stress -m 2 -c 2".
>>> Is this something that your patch fixes, or am I doing something wrong, or
>>> is it just a coincidence?
>>
>> No, I think it's quite likely caused by blow code:
>>
>> static unsigned long
>> tmc_update_etr_buffer(struct coresight_device *csdev,
>>                       struct perf_output_handle *handle,
>>                       void *config)
>> {
>>     unsigned long flags, offset, size = 0;
>>
>>     ...
>>
>>     /* Don't do anything if another tracer is using this sink */
>>     if (atomic_read(csdev->refcnt) != 1) {
>>         spin_unlock_irqrestore(&drvdata->spinlock, flags);
>>         goto out;
>>     }
>>
>>     ...
>>
>>     return size;
>> }
>>
>> When using the system wide tracing, it updates the AUX ring buffer
>> until the last tracer is stopped.  Thus whis is why it only records
>> AUX ring buffer for the last CPU.
>>
>> But this makes sense for me, this is because the last CPU is used to
>> copy trace data to AUX ring buffer (so the perf event PERF_RECORD_AUX
>> occurs on CPU3), but when you decode the trace data, you should can
>> see the activities from other CPUs.
>>
>> Thanks,
>> Leo
>>
> 
> I have one more issue around snapshot mode which I'd like to check if it's
> related to this patchset.
> 
> In "[PATCH v5 0/1] perf cs-etm: Split Coresight decode by aux records",
> I added the warning for a missing AUXTRACE buffer as you suggested.
> Now this warning gets triggered on the last AUX record when using
> snapshot mode. I don't know if you are able to reproduce this.
> 

Hi Leo,

So I tested this set with my aux split patch, and I still get the warning about the
last AUX record not being found, so this is an independent issue. Whether
it could be (or needs to be fixed) at the same time, I'm not sure.

One thing seems to have improved with your set is the number of aux records
produced. For each SIGUSR2, I now get one aux record. Previously I got a random
number that didn't seem to match up, which didn't seem right to me.

But one thing that could be worse is the offsets and sizes. Now the size of the
aux records is always 0x100000, no matter what the -m arguments are, and the size
of the AUX record exceeds the size of the buffer. This makes the split patchset
fall back to processing the whole buffer because it never finds a buffer that the
AUX record fits in. For example:

	0 0 0xad0 [0x30]: PERF_RECORD_AUXTRACE size: 0x20000  offset: 0xe0000  ref: 0x406971eb0346c919  idx: 2  tid: 6794  cpu: 2
	2 5644375601060 0xa88 [0x40]: PERF_RECORD_AUX offset: 0x100000 size: 0x100000 flags: 0x2 [O]

The buffer is only 0x20000 long, but the aux record is 0x100000.

Another issue is that now the offsets don't match up:
	0 0 0xad0 [0x30]: PERF_RECORD_AUXTRACE size: 0x20000
  offset: 0xe0000  ref: 0x406971eb0346c919  idx: 2  tid: 6794  cpu: 2
	2 5644375601060 0xa88 [0x40]: PERF_RECORD_AUX offset: 0x100000 size: 0x100000 flags: 0x2 [O]
	0 0 0x20bb0 [0x30]: PERF_RECORD_AUXTRACE size: 0x20000  offset: 0x1e0000  ref: 0x1a3abb5c2407536a  idx: 2  tid: 6794  cpu: 2
	2 5644942278600 0x20b68 [0x40]: PERF_RECORD_AUX offset: 0x200000 size: 0x100000 flags: 0x2 [O]
	0 0 0x40c90 [0x30]: PERF_RECORD_AUXTRACE size: 0x20000  offset: 0x2e0000  ref: 0x7477d3d43d0a2ac7  idx: 2  tid: 6794  cpu: 2
	2 5645263266480 0x40c48 [0x40]: PERF_RECORD_AUX offset: 0x300000 size: 0x100000 flags: 0x2 [O]
	0 0 0x60d70 [0x30]: PERF_RECORD_AUXTRACE size: 0x20000  offset: 0x3e0000  ref: 0x202b939740c4f041  idx: 2  tid: 6794  cpu: 2
	2 5645569318020 0x60d28 [0x40]: PERF_RECORD_AUX offset: 0x400000 size: 0x100000 flags: 0x2 [O]

The buffers are offset by 0xe0000, but the aux records are on round 0x100000 offsets.

Maybe we need to re-think the aux split patchset, do we not need to split in snapshot mode? Or can we fix this
set to produce aux records that match up?

James

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ