Date:	Mon, 8 Aug 2016 19:34:54 +0800
From:	"Wangnan (F)" <wangnan0@huawei.com>
To:	Arnaldo Carvalho de Melo <acme@kernel.org>
CC:	Namhyung Kim <namhyung@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [BUG] perf: memory leak in perf top



On 2016/8/5 22:12, Arnaldo Carvalho de Melo wrote:
> On Fri, Aug 05, 2016 at 06:58:12PM +0800, Wangnan (F) wrote:
>> perf top consumes far more memory than it needs.
>>
>> Using following commands:
>>
>>   # yes '' > /dev/null
>>   # perf top -e raw_syscalls:sys_enter
>>
>> The system quickly becomes unresponsive because it runs out of memory.
>>
>> Using GDB, I found two call stacks that lead to malloc. See below.
>>
>> [call stack 1]
>>
>> #0  __GI___libc_malloc (bytes=140723951319552) at malloc.c:2841
>> #1  0x00000000004b7246 in memdup (src=0x7f515e04b054, len=68) at ../lib/string.c:29
>> #2  0x00000000004f9299 in hist_entry__init (sample_self=true, template=0x7ffcd9213a80, he=0xa37da50) at util/hist.c:401
>> #3  hist_entry__new (template=template@entry=0x7ffcd9213a80, sample_self=sample_self@entry=true) at util/hist.c:458
>> #4  0x00000000004fa13f in hists__findnew_entry (hists=0x3549d30, hists=0x3549d30, al=0x7ffcd9213ca0, sample_self=true, entry=0x7ffcd9213a80) at util/hist.c:544
>> #5  __hists__add_entry (hists=0x3549d30, al=0x7ffcd9213ca0, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0, sample=<optimized out>, sample_self=sample_self@entry=true, ops=ops@entry=0x0) at util/hist.c:600
>> #6  0x00000000004fa3d6 in hists__add_entry (sample_self=true, sample=<optimized out>, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=<optimized out>) at util/hist.c:611
>> #7  iter_add_single_normal_entry (iter=0x7ffcd9213ce0, al=<optimized out>) at util/hist.c:823
>> #8  0x00000000004fc3c5 in hist_entry_iter__add (iter=0x7ffcd9213ce0, al=0x7ffcd9213ca0, max_stack_depth=<optimized out>, arg=0x7ffcd9214120) at util/hist.c:1030
>> #9  0x00000000004536d7 in perf_event__process_sample (machine=<optimized out>, sample=0x7ffcd9213d30, evsel=0x3549bb0, event=<optimized out>, tool=0x7ffcd9214120) at builtin-top.c:800
>> #10 perf_top__mmap_read_idx (top=top@entry=0x7ffcd9214120, idx=idx@entry=0) at builtin-top.c:866
>> #11 0x00000000004559fd in perf_top__mmap_read (top=0x7ffcd9214120) at builtin-top.c:883
>> #12 __cmd_top (top=0x7ffcd9214120) at builtin-top.c:1018
>> #13 cmd_top (argc=<optimized out>, argv=<optimized out>, prefix=<optimized out>) at builtin-top.c:1344
>> #14 0x0000000000493561 in run_builtin (p=p@entry=0x948000 <commands+288>, argc=argc@entry=4, argv=argv@entry=0x7ffcd92199b0) at perf.c:357
>> #15 0x0000000000437157 in handle_internal_command (argv=0x7ffcd92199b0, argc=4) at perf.c:419
>> #16 run_argv (argv=0x7ffcd9219720, argcp=0x7ffcd921972c) at perf.c:465
>> #17 main (argc=4, argv=0x7ffcd92199b0) at perf.c:609
>>
>> [call stack 2]
>   
>> #0 __GI___libc_malloc (bytes=140723951319456) at malloc.c:2841
>> #1 0x00000000005766de in trace_seq_init (s=0x7ffcd92139b0) at trace-seq.c:60
>> #2 0x00000000004f7047 in get_trace_output (he=0x7ffcd9213a80) at util/sort.c:584
> Yeah, this allocates 4K, which is way too much. Can you try the patch
> below, which amounts to a trace_seq_trim()?
>
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 947d21f38398..37ddcf57500f 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -588,7 +588,7 @@ static char *get_trace_output(struct hist_entry *he)
>   	} else {
>   		pevent_event_info(&seq, evsel->tp_format, &rec);
>   	}
> -	return seq.buffer;
> +	return realloc(seq.buffer, seq.len);
>   }
>   
>   static int64_t
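
(For context: the 4K Arnaldo mentions comes from trace_seq_init(),
which preallocates a fixed-size buffer on every call. Roughly, as a
sketch simplified from tools/lib/traceevent/trace-seq.c, not the exact
code:

	#define TRACE_SEQ_BUF_SIZE 4096

	void trace_seq_init(struct trace_seq *s)
	{
		s->len = 0;
		s->buffer_size = TRACE_SEQ_BUF_SIZE;
		/* 4K stays pinned per hist entry if never trimmed */
		s->buffer = malloc(s->buffer_size);
	}

Since get_trace_output() hands seq.buffer to the caller, which caches
it in each hist_entry, every distinct entry keeps the full 4K alive
unless the buffer is shrunk.)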

It works: memory growth drops from 1.8 GB per 10 seconds to 40 MB
per 10 seconds. Unfortunately, the patch is off by one: please use
realloc(seq.buffer, seq.len + 1), because seq.len does not include
the terminating '\0' :)
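
For reference, the corrected return would look like this (a minimal
sketch; the comment assumes the usual struct trace_seq semantics,
where seq.len is the string length excluding the terminator):

	/*
	 * seq.len counts the formatted characters but not the
	 * trailing '\0', so trim the 4K buffer to len + 1 bytes
	 * to keep the string NUL-terminated.
	 */
	return realloc(seq.buffer, seq.len + 1);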

In my experiment RSS increases continuously because I use
raw_syscalls, which produces far too many distinct entries for
perf top to record, owing to the nature of the argument list
(garbage argument values are also treated as distinct keys by
perf). Things get much better if I use --fields to reduce the
number of possible entries; see the example below. Even so, in
theory perf will eventually consume all system memory, unless we
choose tracepoints and fields carefully enough to keep the set of
possible entries bounded.
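
For example, something like this (the exact field list is only an
illustration; any small set of sort keys that excludes the formatted
trace output should behave similarly):

  # perf top -e raw_syscalls:sys_enter --fields overhead,comm

keeps the number of distinct entries bounded by the number of
commands on the system rather than by the distinct argument strings.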

Thank you.
