Message-Id: <20220204235737.657675e6e468c367c13fc1b2@kernel.org>
Date: Fri, 4 Feb 2022 23:57:37 +0900
From: Masami Hiramatsu <mhiramat@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Darren Hart <dvhart@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>,
André Almeida <andrealmeid@...labora.com>,
James Clark <james.clark@....com>,
John Garry <john.garry@...wei.com>,
Riccardo Mancini <rickyman7@...il.com>,
Yury Norov <yury.norov@...il.com>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Jin Yao <yao.jin@...ux.intel.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Leo Yan <leo.yan@...aro.org>, Andi Kleen <ak@...ux.intel.com>,
Thomas Richter <tmricht@...ux.ibm.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Madhavan Srinivasan <maddy@...ux.ibm.com>,
Shunsuke Nakamura <nakamura.shun@...itsu.com>,
Song Liu <song@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Miaoqian Lin <linmq006@...il.com>,
Stephen Brennan <stephen.s.brennan@...cle.com>,
Kajol Jain <kjain@...ux.ibm.com>,
Alexey Bayduraev <alexey.v.bayduraev@...ux.intel.com>,
German Gomez <german.gomez@....com>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
Eric Dumazet <edumazet@...gle.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
masami.hiramatsu.pt@...achi.com, eranian@...gle.com
Subject: Re: [PATCH v2 0/4] Reference count checker and related fixes
On Sun, 30 Jan 2022 09:40:21 -0800
Ian Rogers <irogers@...gle.com> wrote:
> > > > Hi Ian,
> > > >
> > > > Hmm, but such a macro is not usual for C which perf is written in.
> > > > If I understand correctly, you might want to use memory leak
> > > > analyzer to detect refcount leak, and that analyzer will show
> > > > what data structure is leaked.
> > >
> > > Firstly, thanks for the conversation - this is really useful to
> > > improve the code!
> >
> > Hi Ian,
> >
> > You're welcome! This conversation also useful to me to understand
> > the issue deeper :-)
> >
> > > I think in an ideal world we'd somehow educate things like address
> > > sanitizer of reference counted data structures and they would do a
> > > better job of tracking gets and puts. The problem is pairing gets and
> > > puts.
> >
> > Agreed. From my experience, pairing gets and puts are hard without
> > reviewing the source code, since the refcounter is not only used in a
> > single function, but it is for keeping the object not released until
> > some long process is finished.
> >
> > For example, if the correct pair is like below, funcA-funcD, funcB-funcC,
> > funcA (get)
> > funcB (get)
> > funcC (put)
> > funcD (put)
> >
> > But depending on the execution timing, funcC/funcD can be swapped.
> > funcA (get)
> > funcB (get)
> > funcD (put)
> > funcC (put)
> >
> > And if there is a bug, funcX may get the object by mistake.
> > funcA (get)
> > funcX (get)
> > funcB (get)
> > funcD (put)
> > funcC (put)
> >
> > Or, funcC() might miss to put.
> > funcA (get)
> > funcB (get)
> > funcD (put)
> >
> > In any case, just tracking the get/put, it is hard to know which pair
> > is not right. I saw these patterns when I debugged it. :(
>
> Yep, I've found this issue too :-) The get is being used for the
> side-effect of incrementing a reference count rather than for
> returning the value. This happened in cpumap merge and was easy to fix
> there.
>
> This problem is possible in general, but I think if it were common we
> are doomed. I don't think this pattern is common though. In general a
> reference count is owned by something, the scope of a function or the
> lifetime of a list. If puts were adhoc then it would mean that one
> occurring in a function could be doing it for the side effect of
> freeing on a list. I don't think the code aims to do that. Making the
> code clean with pairing gets and puts is an issue, like with the
> cpumap merge change.
Hi Ian,
Sorry for the late reply.
I now see that pairing get/put is not so hard if we use your
programming pattern. The problem was the ownership of the object.
As you proposed, if we force users to use the returned "token"
instead of the object pointer itself, as below:
  funcA(obj) {
    token = get(obj);
    get_obj(token)->...
    put(token);
  }
Then it is clear who leaks the token.
funcA (get-> token1)
funcX (get-> token3)
funcB (get-> token2)
funcD (put-> token2)
funcC (put-> token1)
In this case token3 is never put, so the put that pairs with funcX is missing.
Or,
funcA (get-> token1)
funcB (get-> token2)
funcD (put-> token2)
In this case funcA's pair is lost.
And if the user accesses the object through a token that has already
been put, that can be detected too.
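To make the traces above concrete, here is a minimal sketch of per-get tokens. Everything in it is an assumption for illustration only (struct obj, struct token, the owner string, the fixed-size token table, run_trace()); it is not the code from the patch series. Each get() hands out its own token, put() retires it, and whatever is still live at the end names the get that was never paired.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reference-counted object, for illustration only. */
struct obj { int refcnt; int value; };

/* One token per get(); "owner" records who took the reference. */
struct token {
	struct obj *orig;
	const char *owner;
	int live;		/* cleared by put() */
};

#define MAX_TOKENS 16
static struct token tokens[MAX_TOKENS];
static int ntokens;	/* slots are never reused, so a stale token stays detectable */

static struct token *get(struct obj *o, const char *owner)
{
	assert(ntokens < MAX_TOKENS);
	struct token *t = &tokens[ntokens++];

	t->orig = o;
	t->owner = owner;
	t->live = 1;
	o->refcnt++;
	return t;
}

/* Convert a token to the object; NULL signals use-after-put. */
static struct obj *get_obj(struct token *t)
{
	return t->live ? t->orig : NULL;
}

static void put(struct token *t)
{
	if (!t->live)
		return;		/* double put: ignored in this sketch */
	t->live = 0;
	t->orig->refcnt--;
}

/* Owner of the first token that was never put, or NULL if none leaked. */
static const char *first_leaker(void)
{
	for (int i = 0; i < ntokens; i++)
		if (tokens[i].live)
			return tokens[i].owner;
	return NULL;
}

/* Replays the funcA/funcX/funcB/funcD/funcC trace; returns the leak count. */
static int run_trace(void)
{
	static struct obj o;
	int leaks = 0;

	memset(tokens, 0, sizeof(tokens));
	ntokens = 0;
	o.refcnt = 0;
	o.value = 42;

	struct token *t1 = get(&o, "funcA");
	struct token *t3 = get(&o, "funcX");	/* the buggy extra get */
	struct token *t2 = get(&o, "funcB");

	assert(get_obj(t2)->value == 42);
	put(t2);				/* funcD */
	put(t1);				/* funcC */
	assert(get_obj(t2) == NULL);		/* use-after-put is visible */
	assert(get_obj(t3) != NULL);		/* token3 is still held */

	for (int i = 0; i < ntokens; i++)
		leaks += tokens[i].live;
	return leaks;
}
```

Calling run_trace() and then first_leaker() from a main() reports one leaked token, taken by funcX. A real checker would more likely allocate each token on the heap and let a leak analyzer report it; the static table here only keeps the sketch self-checking.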
>
> > > In C++ you use RAII types so that the destructor ensures a put -
> > > this can be complex when using data types like lists where you want to
> > > move or swap things onto the list, to keep the single pointer
> > > property. In the C code in Linux we use gotos, similarly to how defer
> > > is used in Go. Anyway, the ref_tracker that Eric Dumazet added solved
> > > the get/put pairing problem by adding a cookie that is passed around.
> >
> > Cool! I will check the ref_tracker :)
> >
> > > The problem with that is that then the cookie becomes part of the API.
> >
> > What the cookie is? some pairing function address??
>
> As I understand it, a token to identify a get.
Yeah, I had missed that your API forces the caller to use the
returned token instead of the object.
So, what about always using a token, instead of wrapping the object
pointer only when debugging?
I feel uncomfortable about changing the data structure name depending
on the debug macro. Instead, I would prefer that get() always
returns a token for the object, and that users need to convert the
token to the object. For example:
struct perf_cpu_map {
...
};
#ifdef REFCNT_CHECKING
typedef struct {struct perf_cpu_map *orig;} perf_cpu_map_token_t;
#else
typedef unsigned long perf_cpu_map_token_t; /* actually this is "struct perf_cpu_map *" */
#endif
perf_cpu_map_token_t perf_cpu_map__get(struct perf_cpu_map *map);
void perf_cpu_map__put(perf_cpu_map_token_t tok);
This explicitly forces users to convert the token to the object
when using it. Of course, if a user uses the object pointer ("map" here)
directly, the token is not used. But we can catch such wrong usage at
compile time.
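A minimal sketch of how the always-token API above could look in use, covering only the type-level part. The struct layout, the GET_map() and make_token() helpers, and demo_map_nr() are assumptions for illustration; the real struct perf_cpu_map and its get/put functions in perf differ. In both build modes the token is a plain value, so "tok->nr" does not compile and every access has to go through GET_map().

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for the real structure (assumption). */
struct perf_cpu_map {
	int refcnt;
	int nr;
};

#ifdef REFCNT_CHECKING
typedef struct { struct perf_cpu_map *orig; } perf_cpu_map_token_t;
#define GET_map(tok) ((tok).orig)
static perf_cpu_map_token_t make_token(struct perf_cpu_map *map)
{
	perf_cpu_map_token_t tok = { map };
	return tok;
}
#else
/* Actually a "struct perf_cpu_map *" stored as an integer value. */
typedef unsigned long perf_cpu_map_token_t;
#define GET_map(tok) ((struct perf_cpu_map *)(tok))
static perf_cpu_map_token_t make_token(struct perf_cpu_map *map)
{
	return (unsigned long)map;
}
#endif

static perf_cpu_map_token_t perf_cpu_map__get(struct perf_cpu_map *map)
{
	map->refcnt++;
	return make_token(map);
}

static void perf_cpu_map__put(perf_cpu_map_token_t tok)
{
	struct perf_cpu_map *map = GET_map(tok);

	if (--map->refcnt == 0)
		free(map);
}

/* Takes a reference, reads a field through the token, drops the reference. */
static int demo_map_nr(void)
{
	struct perf_cpu_map *map = calloc(1, sizeof(*map));
	int nr;

	if (!map)
		return -1;
	map->refcnt = 1;	/* the creator's reference */
	map->nr = 4;

	perf_cpu_map_token_t tok = perf_cpu_map__get(map);

	nr = GET_map(tok)->nr;	/* "tok->nr" is a compile error in both modes */
	perf_cpu_map__put(tok);

	assert(map->refcnt == 1);
	free(map);		/* drop the creator's reference directly */
	return nr;
}
```

The sketch builds and behaves the same with or without -DREFCNT_CHECKING. It does not do any leak or use-after-put detection itself; the checking build would have to add that behind the same token type.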
[...]
> > So my question is that we need to pay the cost to use UNWRAP_ macro
> > on all those object just for finding the "use-after-put" case.
> > Usually the case that "use-after-put" causes actual problem is very
> > limited, or it can be "use-after-free".
>
> So the dso->nsinfo case was a use after put once we added in the
> missing put - it could also be thought of as a double put/free. In
> general use-after-put is going to show where a strict get-then-put
> isn't being followed, if we make sure of that property then the
> reference counting will be accurate.
A double free/put would be detected in a different way. But indeed,
a use-after-put can be overlooked (I think such cases actually exist;
they should show up as "use-after-free", but that depends on the timing).
>
> A case that came up previously was maps__find:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/map.c#n974
> this code returns a map but without doing a get on it, even though a
> map is reference counted:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/map.h#n46
> If we change that code to return a get of the map then we add overhead
> for simple cases of checking a map is present - you can infer you have
> a reference count on the map if you have it on the maps. The
> indirection allows the code to remain as-is, while being able to catch
> misuse.
I don't think using the UNWRAP_* macro counts as "remaining as-is" ;) but
I agree with you that it can catch the misuse.
IMHO, I would rather use the explicit token. I don't like to see
"UNWRAP_map(map)->field", but "GET_map(token)->field" is fine with me.
This is because "map" should be a pointer to a data structure (so
its fields can be accessed without any wrapper), whereas a token is just
a value (which implies that it must always be converted).
In other words, "map->field" looks natural when reviewing, but
"token->field" looks obviously strange.
Thank you,
--
Masami Hiramatsu <mhiramat@...nel.org>