linux-kernel - Re: [PATCH v5 07/10] perf record: implement -z,--compression

Message-ID: <002e7e10-b0ef-df2a-261c-88fd9c00364d@linux.intel.com>
Date:   Thu, 7 Mar 2019 18:26:47 +0300
From:   Alexey Budankov <alexey.budankov@...ux.intel.com>
To:     Jiri Olsa <jolsa@...hat.com>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Andi Kleen <ak@...ux.intel.com>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5 07/10] perf record: implement -z,--compression_level=n
 option and compression


On 07.03.2019 15:14, Jiri Olsa wrote:
> On Thu, Mar 07, 2019 at 11:39:46AM +0300, Alexey Budankov wrote:
>>
>> On 05.03.2019 15:25, Jiri Olsa wrote:
>>> On Fri, Mar 01, 2019 at 06:58:32PM +0300, Alexey Budankov wrote:
>>>
>>> SNIP
>>>
>>>>  
>>>>  	/*
>>>>  	 * Increment md->refcount to guard md->data[idx] buffer
>>>> @@ -350,7 +357,7 @@ int perf_mmap__aio_push(struct perf_mmap *md, void *to, int idx,
>>>>  	md->prev = head;
>>>>  	perf_mmap__consume(md);
>>>>  
>>>> -	rc = push(to, &md->aio.cblocks[idx], md->aio.data[idx], size0 + size, *off);
>>>> +	rc = push(to, md->aio.data[idx], size0 + size, *off, &md->aio.cblocks[idx]);
>>>>  	if (!rc) {
>>>>  		*off += size0 + size;
>>>>  	} else {
>>>> @@ -556,13 +563,15 @@ int perf_mmap__read_init(struct perf_mmap *map)
>>>>  }
>>>>  
>>>>  int perf_mmap__push(struct perf_mmap *md, void *to,
>>>> -		    int push(struct perf_mmap *map, void *to, void *buf, size_t size))
>>>> +		    int push(struct perf_mmap *map, void *to, void *buf, size_t size),
>>>> +		    perf_mmap__compress_fn_t compress, void *comp_data)
>>>>  {
>>>>  	u64 head = perf_mmap__read_head(md);
>>>>  	unsigned char *data = md->base + page_size;
>>>>  	unsigned long size;
>>>>  	void *buf;
>>>>  	int rc = 0;
>>>> +	size_t mmap_len = perf_mmap__mmap_len(md);
>>>>  
>>>>  	rc = perf_mmap__read_init(md);
>>>>  	if (rc < 0)
>>>> @@ -574,7 +583,10 @@ int perf_mmap__push(struct perf_mmap *md, void *to,
>>>>  		buf = &data[md->start & md->mask];
>>>>  		size = md->mask + 1 - (md->start & md->mask);
>>>>  		md->start += size;
>>>> -
>>>> +		if (compress) {
>>>> +			size = compress(comp_data, md->data, mmap_len, buf, size);
>>>> +			buf = md->data;
>>>> +		}
>>>>  		if (push(md, to, buf, size) < 0) {
>>>>  			rc = -1;
>>>>  			goto out;
>>>
>>> when we discussed the compress callback should be another layer
>>> in perf_mmap__push I was thinking more of the layered/fifo design,
>>> like:
>>>
>>> normally we call:
>>>
>>> 	perf_mmap__push(... push = record__pushfn ...)
>>> 		-> reads mmap data and calls push(data), which translates as:
>>>
>>> 		record__pushfn(data);
>>> 			- which stores the data
>>>
>>>
>>> for compressed it'd be:
>>>
>>> 	perf_mmap__push(... push = compressed_push ...)
>>>
>>> 		-> reads mmap data and calls push(data), which translates as:
>>>
>>> 		compressed_push(data)
>>> 			-> reads data, compresses it and calls the next push callback in line:
>>>
>>> 			record__pushfn(data)
>>> 				- which stores the data
>>>
>>>
>>> there'd need to be the logic for compressed_push to
>>> remember the 'next push' function
>>
>> That is suboptimal for AIO. Also, compression is an independent operation that
>> could be applied at any of the push stages you mention.
> 
> not sure what you mean by suboptimal, but I think
> that it can still happen in a subsequent push callback
> 
>>
>>>
>>> but I think this was the original idea behind the
>>> perf_mmap__push -> it gets the data and pushes them for
>>> the next processing.. it should stay as simple as that
>>
>> Agreed on keeping it simple. At the moment there is no push to a next
>> processing stage in the code, so the provided implementation fits both the
>> serial and the AIO case while sticking to simplicity as much as possible. If
>> you see something that would fit better, please speak up and share.
> 
> I have to insist that perf_mmap__push stays untouched
> and we do other processing in the push callbacks

What about perf_mmap__aio_push()?

Without compression it does
	memcpy(), memcpy(), aio_push()

With compression it does
	memcpy_with_compression(), memcpy_with_compression(), aio_push()

Any deviation that increases the number of copy operations, i.e. implementing
three or more, is suboptimal in terms of runtime overhead and of reducing data loss.
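
To make the copy counting concrete, here is a minimal sketch. It is not the
perf code: the ring buffer content is assumed to arrive as two chunks (it may
wrap), a toy run-length encoder stands in for the real compressor, and
copy_fn_t, plain_copy(), toy_rle(), push_fused() and push_layered() are
hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook: moves src into dst, returns the bytes written to dst. */
typedef size_t (*copy_fn_t)(void *dst, size_t dst_size,
			    const void *src, size_t src_size);

/* Plain memcpy() behind the hook; clamps to the destination size. */
static size_t plain_copy(void *dst, size_t dst_size,
			 const void *src, size_t src_size)
{
	if (src_size > dst_size)
		src_size = dst_size;
	memcpy(dst, src, src_size);
	return src_size;
}

/*
 * Toy stand-in for a real compressor: emits (run length, byte) pairs.
 * Stops early if dst is full, so callers must size dst generously.
 */
static size_t toy_rle(void *dst, size_t dst_size,
		      const void *src, size_t src_size)
{
	const unsigned char *in = src;
	unsigned char *out = dst;
	size_t i = 0, o = 0;

	while (i < src_size && o + 2 <= dst_size) {
		unsigned char byte = in[i];
		size_t run = 1;

		while (i + run < src_size && in[i + run] == byte && run < 255)
			run++;
		out[o++] = (unsigned char)run;
		out[o++] = byte;
		i += run;
	}
	return o;
}

/* Two passes over the data: each chunk goes through 'copy' exactly once. */
static size_t push_fused(unsigned char *out, size_t out_size,
			 const void *c0, size_t s0,
			 const void *c1, size_t s1,
			 copy_fn_t copy, unsigned int *passes)
{
	size_t n;

	assert(passes);
	n = copy(out, out_size, c0, s0);
	(*passes)++;
	n += copy(out + n, out_size - n, c1, s1);
	(*passes)++;
	return n;
}

/* Three passes: two plain copies into staging, then one compression pass. */
static size_t push_layered(unsigned char *out, size_t out_size,
			   unsigned char *staging, size_t staging_size,
			   const void *c0, size_t s0,
			   const void *c1, size_t s1,
			   unsigned int *passes)
{
	size_t staged = push_fused(staging, staging_size, c0, s0, c1, s1,
				   plain_copy, passes);
	size_t n = toy_rle(out, out_size, staging, staged);

	(*passes)++;
	return n;
}
```

With chunks "aaaa" and "bbbb", both variants produce the same 4 output bytes,
but push_fused() with toy_rle counts 2 passes while push_layered() counts 3.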

Compression for serial streaming can be implemented in the push() callback.
The AIO case would get compression via a parameter to aio_push().
That way both trace writing schemes can be extended optimally.
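
For the serial case, a compressing push() callback that remembers the next
callback could look roughly like the sketch below. This is an illustration
under assumptions, not the patch: the push signature is simplified (the real
one also takes a struct perf_mmap pointer), and push_fn_t, struct store_ctx,
struct comp_ctx, store_push(), compressed_push() and the toy run-length
encoder are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical push callback type. */
typedef int (*push_fn_t)(void *to, void *buf, size_t size);

/* Final stage: appends the data to an in-memory "file". */
struct store_ctx {
	unsigned char data[256];
	size_t len;
};

static int store_push(void *to, void *buf, size_t size)
{
	struct store_ctx *s = to;

	if (size > sizeof(s->data) - s->len)
		return -1;
	memcpy(s->data + s->len, buf, size);
	s->len += size;
	return 0;
}

/* Compressing stage: remembers the next push callback and its argument. */
struct comp_ctx {
	push_fn_t next;
	void *next_to;
	unsigned char scratch[512];
};

/* Toy stand-in for a real compressor: emits (run length, byte) pairs. */
static size_t toy_rle(unsigned char *out, size_t out_size,
		      const unsigned char *in, size_t in_size)
{
	size_t i = 0, o = 0;

	while (i < in_size && o + 2 <= out_size) {
		unsigned char byte = in[i];
		size_t run = 1;

		while (i + run < in_size && in[i + run] == byte && run < 255)
			run++;
		out[o++] = (unsigned char)run;
		out[o++] = byte;
		i += run;
	}
	return o;
}

static int compressed_push(void *to, void *buf, size_t size)
{
	struct comp_ctx *c = to;
	size_t n;

	assert(c->next);
	/* The toy encoder can double the input in the worst case. */
	if (size > sizeof(c->scratch) / 2)
		return -1;
	n = toy_rle(c->scratch, sizeof(c->scratch), buf, size);
	/* Hand the compressed bytes to the next callback in line. */
	return c->next(c->next_to, c->scratch, n);
}
```

A caller would fill a struct comp_ctx with next = store_push and next_to
pointing at a struct store_ctx, then pass compressed_push and the comp_ctx
wherever a plain push callback and its 'to' argument are expected.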

~Alexey

> 
> jirka
>