Message-ID: <87r4b04n2t.fsf@sejong.aot.lge.com>
Date: Fri, 01 Nov 2013 16:07:22 +0900
From: Namhyung Kim <namhyung@...nel.org>
To: Rodrigo Campos <rodrigo@...g.com.ar>
Cc: Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Paul Mackerras <paulus@...ba.org>,
Ingo Molnar <mingo@...nel.org>,
Namhyung Kim <namhyung.kim@....com>,
LKML <linux-kernel@...r.kernel.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Stephane Eranian <eranian@...gle.com>,
Jiri Olsa <jolsa@...hat.com>, Arun Sharma <asharma@...com>
Subject: Re: [PATCH 08/14] perf report: Cache cumulative callchains
Hi Rodrigo,
On Thu, 31 Oct 2013 11:13:34 +0000, Rodrigo Campos wrote:
> On Thu, Oct 31, 2013 at 03:56:10PM +0900, Namhyung Kim wrote:
>> From: Namhyung Kim <namhyung.kim@....com>
>> /*
>> + * This is for detecting cycles or recursions so that they're
>> + * cumulated only one time to prevent entries more than 100%
>> + * overhead.
>> + */
>> + ccache = malloc(sizeof(*ccache) * PERF_MAX_STACK_DEPTH);
>> + if (ccache == NULL)
>> + return -ENOMEM;
>> +
>> + node = callchain_cursor_current(&callchain_cursor);
>> + if (node == NULL)
>> + return 0;
>
> Here you return without assigning iter->priv nor iter->priv->dso iter->priv->sym
Right! I forgot to set iter->priv to ccache in this case.
>
>> +
>> + ccache[0].dso = node->map->dso;
>> + ccache[0].sym = node->sym;
>> +
>> + iter->priv = ccache;
>> + iter->curr = 1;
>
> Because the assignment is done here.
>
>> +
>> + /*
>> * The first callchain node always contains same information
>> * as a hist entry itself. So skip it in order to prevent
>> * double accounting.
>> @@ -501,8 +528,29 @@ iter_add_next_cumulative_entry(struct add_entry_iter *iter,
>> {
>> struct perf_evsel *evsel = iter->evsel;
>> struct perf_sample *sample = iter->sample;
>> + struct cumulative_cache *ccache = iter->priv;
>> struct hist_entry *he;
>> int err = 0;
>> + int i;
>> +
>> + /*
>> + * Check if there's duplicate entries in the callchain.
>> + * It's possible that it has cycles or recursive calls.
>> + */
>> + for (i = 0; i < iter->curr; i++) {
>> + if (sort__has_sym) {
>> + if (ccache[i].sym == al->sym)
>> + return 0;
>> + } else {
>> + /* Not much we can do - just compare the dso. */
>> + if (ccache[i].dso == al->map->dso)
>
> sym and dso are used here
>
>> + return 0;
>> + }
>> + }
>> +
>> + ccache[i].dso = al->map->dso;
>> + ccache[i].sym = al->sym;
>> + iter->curr++;
>>
>> he = __hists__add_entry(&evsel->hists, al, iter->parent, NULL, NULL,
>> sample->period, sample->weight,
>> @@ -538,6 +586,7 @@ iter_finish_cumulative_entry(struct add_entry_iter *iter,
>> evsel->hists.stats.total_period += sample->period;
>> hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
>>
>> + free(iter->priv);
>
> And here I'm seeing a double free when trying the patchset with other examples.
> I added a printf to the "if (node == NULL)" case and I'm hitting it. So it seems
> to me that, when reusing the entry, every user is freeing it and then the double
> free.
>
> This is my first time looking at perf code, so I might be missing LOT of things,
> sorry in advance :)
Don't say sorry! You're very helpful and found a real bug!
>
> I tried copying the dso and sym to the new allocated mem (and assigning
> iter->priv = ccache before the return if "node == NULL"), as shown in the
> attached patch, but when running with valgrind it also added some invalid reads
> and segfaults (without valgrind it didn't segfault, but I must be "lucky").
>
> So if there is no node (node == NULL) and we cannot read the dso and sym from
> the current values of iter->priv (they show invalid reads in valgrind), I'm not
> sure where can we read them. And, IIUC, we should initialize them because they
> are used later. So maybe there are only some cases where we can read iter->priv
> and for the other cases just initialize to something (although doesn't feel
> possible because it's the dso and sym) ? Or should we read/copy them from some
> other place (maybe before some other thing is free'd) ? Or maybe forget about
> the malloc when node == NULL and just use iter->priv and the free shouldn't be
> executed till iter->curr == 1 ? I added that if for the free, but didn't help.
> Although I didn't really check how iter->curr is used. What am I missing ?
If node == NULL, it means there are no valid callchains, so there is no
need to go into the loop: iter_next_cumulative_entry() returns 0, so
iter_add_next_cumulative_entry() is never called. So don't worry about
the sym and dso in this case.

The problem is that iter->priv is freed unconditionally. Since it still
holds the previous ccache pointer (which was already freed), it can lead
to a double free if the next entry has no valid callchains.
>
> I'm not really sure which is the fix for this. Also just in case I tried
> assigning "iter->priv = NULL" after it's free'd and it """fixes""" it.
I think the right fix is to assign "iter->priv = NULL" after the free, as
you said. But I changed this patch a bit for v3, so I need to check it again.
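
To make the idea concrete, here is a minimal standalone sketch of the two
changes discussed above (hand the cache to iter->priv before the early
return, and clear iter->priv after freeing it). It is not the actual perf
code: the struct layouts, MAX_DEPTH, prepare_entry() and finish_entry() are
hypothetical stand-ins for the real structures and iter callbacks, and the
has_callchain flag only models the "node == NULL" check.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the perf structures; not the real definitions. */
struct cumulative_cache {
	const void *dso;
	const void *sym;
};

struct add_entry_iter {
	void *priv;	/* per-sample cache, owned by the iterator */
	int curr;	/* number of cache slots in use */
};

#define MAX_DEPTH 127	/* placeholder for PERF_MAX_STACK_DEPTH */

/*
 * Model of the prepare step. has_callchain == 0 stands for the
 * "node == NULL" case in the patch.
 */
int prepare_entry(struct add_entry_iter *iter, int has_callchain)
{
	struct cumulative_cache *ccache;

	ccache = malloc(sizeof(*ccache) * MAX_DEPTH);
	if (ccache == NULL)
		return -ENOMEM;

	/* Hand over ownership before any early return can happen. */
	iter->priv = ccache;
	iter->curr = 0;

	if (!has_callchain)
		return 0;

	/* The real code would record node->map->dso and node->sym here. */
	ccache[0].dso = NULL;
	ccache[0].sym = NULL;
	iter->curr = 1;
	return 0;
}

/* Model of the finish step: free the cache and drop the stale pointer. */
void finish_entry(struct add_entry_iter *iter)
{
	free(iter->priv);
	iter->priv = NULL;	/* a repeated free(NULL) is a harmless no-op */
	iter->curr = 0;
}
```

With this shape every sample gets its own allocation, and even if the
finish step runs again before the next prepare, it only sees a NULL
pointer rather than the already-freed cache.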
>
> Just reverting the patch (reverts without conflict) also solves the double free
> problem for me (although it probably introduces the problem the patch tries to
> fix =) and seems to make valgrind happy too.
>
> Thanks a lot and sorry again if I'm completely missing some "rules/invariants",
> I'm really new to perf :)
You didn't miss anything, and I really appreciate your review. :)
Thanks,
Namhyung