Date:   Thu, 6 Sep 2018 14:50:56 +0300
From:   Alexey Budankov <alexey.budankov@...ux.intel.com>
To:     Jiri Olsa <jolsa@...hat.com>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Andi Kleen <ak@...ux.intel.com>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v7 1/2]: perf util: map data buffer for preserving
 collected data



On 06.09.2018 14:04, Jiri Olsa wrote:
> On Wed, Sep 05, 2018 at 10:19:56AM +0300, Alexey Budankov wrote:
>>
>> The map->data buffers are used to preserve map->base profiling data 
>> for writing to disk. AIO map->cblocks are used to queue corresponding 
>> map->data buffers for asynchronous writing. map->cblocks objects are 
>> located in the last page of every map->data buffer.
>>
>> Signed-off-by: Alexey Budankov <alexey.budankov@...ux.intel.com>
>> ---
>>  Changes in v7:
>>   - implemented handling record.aio setting from perfconfig file
>>  Changes in v6:
>>   - adjusted setting of priorities for cblocks;
>>  Changes in v5:
>>   - reshaped layout of data structures;
>>   - implemented --aio option;
>>  Changes in v4:
>>   - converted mmap()/munmap() to malloc()/free() for mmap->data buffer management 
>>  Changes in v2:
>>   - converted zalloc() to calloc() for allocation of mmap_aio array,
>>   - cleared typo and adjusted fallback branch code;
>> ---
>>  tools/perf/builtin-record.c | 15 ++++++++++++-
>>  tools/perf/perf.h           |  1 +
>>  tools/perf/util/evlist.c    |  7 +++---
>>  tools/perf/util/evlist.h    |  3 ++-
>>  tools/perf/util/mmap.c      | 53 +++++++++++++++++++++++++++++++++++++++++++++
>>  tools/perf/util/mmap.h      |  6 ++++-
>>  6 files changed, 79 insertions(+), 6 deletions(-)
>>
>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>> index 22ebeb92ac51..f17a6f9cb1ba 100644
>> --- a/tools/perf/builtin-record.c
>> +++ b/tools/perf/builtin-record.c
>> @@ -326,7 +326,8 @@ static int record__mmap_evlist(struct record *rec,
>>  
>>  	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages,
>>  				 opts->auxtrace_mmap_pages,
>> -				 opts->auxtrace_snapshot_mode) < 0) {
>> +				 opts->auxtrace_snapshot_mode,
>> +				 opts->nr_cblocks) < 0) {
>>  		if (errno == EPERM) {
>>  			pr_err("Permission error mapping pages.\n"
>>  			       "Consider increasing "
>> @@ -1287,6 +1288,8 @@ static int perf_record_config(const char *var, const char *value, void *cb)
>>  		var = "call-graph.record-mode";
>>  		return perf_default_config(var, value, cb);
>>  	}
>> +	if (!strcmp(var, "record.aio"))
>> +		rec->opts.nr_cblocks = strtol(value, NULL, 0);
>>  
>>  	return 0;
>>  }
>> @@ -1519,6 +1522,7 @@ static struct record record = {
>>  			.default_per_cpu = true,
>>  		},
>>  		.proc_map_timeout     = 500,
>> +		.nr_cblocks	      = 2
>>  	},
>>  	.tool = {
>>  		.sample		= process_sample_event,
>> @@ -1678,6 +1682,8 @@ static struct option __record_options[] = {
>>  			  "signal"),
>>  	OPT_BOOLEAN(0, "dry-run", &dry_run,
>>  		    "Parse options then exit"),
>> +	OPT_INTEGER(0, "aio", &record.opts.nr_cblocks,
>> +		    "asynchronous trace write operations (min: 1, max: 32, default: 2)"),
> 
> ok, so this got silently added in recent versions and I couldn't
> find any justification for it.. why do we use more aio blocks for
> single map now? also why the default is 2?

Having more blocks may improve throughput from kernel to userspace in 
cases where more data arrives at map->base while a previously started 
AIO write has not finished yet. That can easily happen between calls to 
record__mmap_read_evlist().
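
To make the argument concrete, here is a minimal toy model rather than 
anything taken from the patch. It assumes one chunk of data becomes ready 
per tick and that every asynchronous write takes a fixed number of ticks. 
The function name, the tick model and the latency value are all 
hypothetical; only the upper bound of 32 blocks comes from the option help 
text above.

```c
/* Toy model only: this is not perf code and every name is hypothetical. */

#define MAX_BLOCKS 32	/* upper bound quoted in the --aio help text */

/*
 * Assumption: one chunk of data becomes ready per tick and each
 * asynchronous write takes write_latency ticks to complete.
 * Returns the total number of ticks the producer spends waiting for a
 * free control block, or -1 on invalid arguments.
 */
long count_stall_ticks(int nr_blocks, int nr_chunks, long write_latency)
{
	long done_at[MAX_BLOCKS] = { 0 };	/* completion time per block */
	long now = 0, stalls = 0;

	if (nr_blocks < 1 || nr_blocks > MAX_BLOCKS ||
	    nr_chunks < 0 || write_latency < 0)
		return -1;

	for (int chunk = 0; chunk < nr_chunks; chunk++) {
		int slot = 0;

		/* the chunk is not available before its arrival tick */
		if (now < chunk)
			now = chunk;

		/* pick the block whose write completes earliest */
		for (int i = 1; i < nr_blocks; i++)
			if (done_at[i] < done_at[slot])
				slot = i;

		/* every block is still busy: wait for that one */
		if (done_at[slot] > now) {
			stalls += done_at[slot] - now;
			now = done_at[slot];
		}

		/* queue the write on the chosen block */
		done_at[slot] = now + write_latency;
	}

	return stalls;
}
```

In this model, with a write latency of 2 ticks, a single block makes the 
producer wait on almost every chunk, while two blocks absorb the same load 
without waiting. Real behaviour depends on the storage and on how often 
record__mmap_read_evlist() runs, so this is an illustration of the effect 
and not a measurement.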

> 
> the option should be more specific like 'aio-blocks'

ok.

> 
> the change is difficult enough.. we should start simple and add
> these additions with proper justification in separate patches

Setting the default to 1 gives the simplest solution. I could provide 
justification by showing a case where spinning in record__aio_sync() 
becomes the hotspot.

> 
> thanks,
> jirka
> 
