linux-kernel - Re: [PATCH] perf record: Add snapshot mode support for perf's regular events


Message-ID: <56553022.8000101@huawei.com>
Date:	Wed, 25 Nov 2015 11:50:58 +0800
From:	"Wangnan (F)" <wangnan0@...wei.com>
To:	Arnaldo Carvalho de Melo <acme@...nel.org>,
	David Ahern <dsahern@...il.com>
CC:	Yunlong Song <yunlong.song@...wei.com>, <a.p.zijlstra@...llo.nl>,
	<paulus@...ba.org>, <mingo@...hat.com>,
	<linux-kernel@...r.kernel.org>, <namhyung@...nel.org>,
	<ast@...nel.org>, <masami.hiramatsu.pt@...achi.com>,
	<kan.liang@...el.com>, <adrian.hunter@...el.com>,
	<jolsa@...nel.org>, <bp@...en8.de>, <jean.pihet@...aro.org>,
	<rric@...nel.org>, <xiakaixu@...wei.com>, <hekuang@...wei.com>
Subject: Re: [PATCH] perf record: Add snapshot mode support for perf's regular
 events

On 2015/11/24 23:20, Arnaldo Carvalho de Melo wrote:
> Em Tue, Nov 24, 2015 at 08:06:41AM -0700, David Ahern escreveu:
>> On 11/24/15 7:00 AM, Yunlong Song wrote:
>>> +static int record__write(struct record *rec, void *bf, size_t size)
>>> +{
>>> +	if (rec->memory.size && memory_enabled) {
>>> +		if (perf_memory__write(&rec->memory, bf, size) < 0) {
>>> +			pr_err("failed to write memory data, error: %m\n");
>>> +			return -1;
>>> +		}
>>> +	} else {
>>> +		if (perf_data_file__write(rec->session->file, bf, size) < 0) {
>>> +			pr_err("failed to write perf data, error: %m\n");
>>> +			return -1;
>>> +		}
>>> +		rec->bytes_written += size;
>>>   	}
>>>
>>> -	rec->bytes_written += size;
>>>   	return 0;
>>>   }
>>>
>>> @@ -86,6 +214,8 @@ static int record__mmap_read(struct record *rec, int idx)
>>>   	if (old == head)
>>>   		return 0;
>>>
>>> +	memory_enabled = 1;
>>> +
>>>   	rec->samples++;
>>>
>>>   	size = head - old;
>>> @@ -113,6 +243,7 @@ static int record__mmap_read(struct record *rec, int idx)
>>>   	md->prev = old;
>>>   	perf_evlist__mmap_consume(rec->evlist, idx);
>>>   out:
>>> +	memory_enabled = 0;
>>>   	return rc;
>>>   }
>>>
>> So you are basically ignoring all samples until SIGUSR2 is received. That
> No, he is not, its just that his code is difficult to follow, has to be
> rewritten, but he is ignoring just PERF_RECORD_SAMPLE events, so it
> will..
>
>> means the resulting data file will have limited history of task events for
> ... have a complete history of task events, since PERF_RECORD_FORK, etc
> are not being ignored.
>
> No?

Actually, we are discussing this problem.

For such tracking events (PERF_RECORD_FORK...), we have the dummy event,
so we can receive tracking events through a separate channel and don't
have to parse every event to pick them out. We can then process tracking
events differently, which makes more interesting things possible. For
example, squashing those tracking events if they take too much memory...

Furthermore, there's another problem being discussed: if the userspace
ring buffer is byte-based, parsing events is unavoidable. Without parsing
events we are unable to find the new 'head' pointer when overwriting.
Instead, we are thinking about a bucket-based ring buffer: perf maintains
a series of buckets, and each time 'poll' returns, perf copies the new
events to the start of a bucket. If all buckets are occupied, we drop the
oldest bucket. A bucket-based ring buffer wastes some memory but avoids
event parsing.
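
A minimal sketch of the bucket-based idea described above, not code from
the patch. The names (struct bucket_ring, bucket_ring__push,
bucket_ring__get) and the sizes NR_BUCKETS and BUCKET_SIZE are
hypothetical. Each call to bucket_ring__push() stands for one 'poll'
round; once every bucket is occupied, the oldest one is overwritten
without looking inside the event data.

```c
#include <stddef.h>
#include <string.h>

#define NR_BUCKETS	4	/* hypothetical: number of buckets */
#define BUCKET_SIZE	4096	/* hypothetical: bytes per bucket */

struct bucket {
	size_t len;			/* bytes of valid event data */
	unsigned char data[BUCKET_SIZE];
};

struct bucket_ring {
	struct bucket buckets[NR_BUCKETS];
	unsigned int next;		/* slot used by the next push */
	unsigned int used;		/* number of occupied buckets */
};

/*
 * Copy one poll round's events to the start of the next bucket.
 * When all buckets are occupied this overwrites the oldest one.
 * Returns -1 if the data does not fit into a single bucket.
 */
static int bucket_ring__push(struct bucket_ring *ring, const void *buf,
			     size_t size)
{
	struct bucket *b;

	if (size > BUCKET_SIZE)
		return -1;

	b = &ring->buckets[ring->next];
	memcpy(b->data, buf, size);
	b->len = size;

	ring->next = (ring->next + 1) % NR_BUCKETS;
	if (ring->used < NR_BUCKETS)
		ring->used++;
	return 0;
}

/* Return the i-th oldest bucket (0 is the oldest), or NULL if none. */
static const struct bucket *bucket_ring__get(const struct bucket_ring *ring,
					     unsigned int i)
{
	unsigned int oldest;

	if (i >= ring->used)
		return NULL;

	oldest = (ring->next + NR_BUCKETS - ring->used) % NR_BUCKETS;
	return &ring->buckets[(oldest + i) % NR_BUCKETS];
}
```

The cost is the unused tail of each bucket; in exchange, dropping old
data never requires locating an event boundary.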

And there are many other problems in this patch. For example, when
SIGUSR2 is received, we need to do something to make all perf events
start dumping. The current implementation can't ensure that we receive
the events from just before the SIGUSR2 if 'no-buffer' is not set.

Also, output events are in one perf.data, which is not user friendly.
Our final goal is to make perf a daemonized monitor, which can run 24x7
in the user's environment. Each time a glitch is detected, a framework
sends a signal to perf to get a perf.data from it. The framework manages
those perf.data files like logrotate, helping developers analyze those
glitches.

We are looking for a route to implementing the final monitor. This patch
is an attempt to let you know what we want and to get your thoughts about
it. It looks like you agree with our basic idea. That's good. We have
decided to start with some small features that support the final goal.
For example, snapshot mode for specific events:

  # perf record -a -e cycles/snapshot/

And when C-c is pressed, for the cycles event, only the data still in
the kernel would be dumped.

Thank you.
