Message-ID: <20160419131947.3c5208b4@gandalf.local.home>
Date: Tue, 19 Apr 2016 13:19:47 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Jiri Olsa <jolsa@...nel.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
linux-trace-users@...r.kernel.org
Subject: Re: [RFC][PATCH 2/4] tracing: Use pid bitmap instead of a pid array for set_event_pid

On Tue, 19 Apr 2016 16:55:28 +0000 (UTC)
Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
> ----- On Apr 19, 2016, at 10:34 AM, rostedt rostedt@...dmis.org wrote:
>
> > From: Steven Rostedt <rostedt@...dmis.org>
> >
> > In order to add the ability to let tasks that are filtered by the events
> > have their children also be traced on fork (and then not traced on exit),
> > convert the array into a pid bitmask. Most of the time the number of pids is
> > only 32768 pids or a 4k bitmask, which is the same size as the default list
> > currently is, and that list could grow if more pids are listed.
> >
> > This also greatly simplifies the code.
>
> The maximum PID number can be increased with sysctl.
>
> See "pid_max" in Documentation/sysctl/kernel.txt
>
> What happens when you have a very large pid_max set ?

I discussed this with HPA, and it appears that the maximum pid_max would
require a bitmap of about 1/2 meg (the current default needs 4k). This is
also why I chose to allocate the bitmap with vmalloc and not as a
contiguous page allocation.

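
As a rough check of those sizes: one bit per possible pid means the
bitmap needs pid_max / 8 bytes. The helper below is only an illustrative
sketch, not code from the patch; the name pid_bitmap_bytes() is made up,
and 4194304 is assumed as the upper limit for pid_max (PID_MAX_LIMIT on
64-bit kernels).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: bytes needed for a bitmap holding one bit per
 * possible pid in the range [0, pid_max), rounded up to a whole byte.
 */
static size_t pid_bitmap_bytes(unsigned long pid_max)
{
	assert(pid_max > 0);
	return (pid_max + 7) / 8;
}
```

With the default pid_max of 32768 this gives 4096 bytes (4k), and with
the assumed limit of 4194304 it gives 524288 bytes (1/2 meg).
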
>
> You say "most of the time" as if this was a fast-path vs a slow-path,
> but it is not the case here.

I meant "most of the time" as "by default". Yes, you can make pid_max
really big, but in that case you had better have enough memory in your
system to handle that many threads. Thus the 1/2 meg used for tracking
pids shouldn't be an issue.

>
> This is a configuration option that can significantly hurt memory usage
> in configurations using a large pid_max.

No, it is created dynamically. If you never write anything into the
set_event_pid file, then you have nothing to worry about, as nothing
is allocated. The bitmap is created when a pid is added to the file, and
only then. If the allocation fails, the write will return -ENOMEM as the
errno.

Again, if you have a large pid_max your box had better have a lot of
memory to begin with, because this bitmap will be negligible compared to
the memory required to handle a large number of tasks.

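
The allocate-on-first-write behavior described above could look roughly
like the following. This is a userspace sketch under stated assumptions,
not the patch itself: struct pid_filter, pid_filter_add() and
pid_filter_test() are hypothetical names, and calloc() stands in for the
kernel's vmalloc-based allocation.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Bits in one bitmap word. */
#define FILTER_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical filter state: no bitmap exists until a pid is added. */
struct pid_filter {
	unsigned long *bits;     /* NULL until the first successful add */
	unsigned long pid_max;   /* number of bits the bitmap must cover */
};

/* Add a pid; allocates the bitmap on first use. Returns 0 or -errno. */
static int pid_filter_add(struct pid_filter *f, unsigned long pid)
{
	assert(f != NULL);
	if (pid >= f->pid_max)
		return -EINVAL;

	if (!f->bits) {
		size_t words = (f->pid_max + FILTER_WORD_BITS - 1) / FILTER_WORD_BITS;

		/* calloc() is a stand-in for a zeroed vmalloc allocation. */
		f->bits = calloc(words, sizeof(unsigned long));
		if (!f->bits)
			return -ENOMEM;
	}

	f->bits[pid / FILTER_WORD_BITS] |= 1UL << (pid % FILTER_WORD_BITS);
	return 0;
}

/* Returns 1 if the pid is in the filter, 0 otherwise (or if unallocated). */
static int pid_filter_test(const struct pid_filter *f, unsigned long pid)
{
	if (!f->bits || pid >= f->pid_max)
		return 0;
	return (int)((f->bits[pid / FILTER_WORD_BITS] >> (pid % FILTER_WORD_BITS)) & 1UL);
}
```

A filter that is never written to keeps bits == NULL and costs nothing
beyond the struct itself; only the first add pays for the bitmap.
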
>
> FWIW, I implement a similar feature with a hash table in lttng-modules.
> I don't have the child process tracking though, which is a neat improvement.

I originally had a complex hash algorithm because I too was worried
about the size of pid_max and using a bitmap, but HPA convinced me it
was the way to go.

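
To show why a bitmap keeps the child tracking simple, here is a minimal
userspace sketch of the fork/exit idea from the patch description
(children of a filtered task are traced on fork and dropped on exit).
Everything here is hypothetical and for illustration only: the demo_*
names, the fixed DEMO_PID_MAX of 32768, and the static bitmap are
assumptions, not the code in the patch.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_PID_MAX   32768UL
#define DEMO_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* One bit per pid; statically sized here only to keep the sketch short. */
static unsigned long demo_bits[DEMO_PID_MAX / (sizeof(unsigned long) * CHAR_BIT)];

static void demo_set(unsigned long pid)
{
	assert(pid < DEMO_PID_MAX);
	demo_bits[pid / DEMO_WORD_BITS] |= 1UL << (pid % DEMO_WORD_BITS);
}

static void demo_clear(unsigned long pid)
{
	assert(pid < DEMO_PID_MAX);
	demo_bits[pid / DEMO_WORD_BITS] &= ~(1UL << (pid % DEMO_WORD_BITS));
}

static int demo_test(unsigned long pid)
{
	assert(pid < DEMO_PID_MAX);
	return (int)((demo_bits[pid / DEMO_WORD_BITS] >> (pid % DEMO_WORD_BITS)) & 1UL);
}

/* On fork: the child inherits the parent's filter bit. */
static void demo_on_fork(unsigned long parent, unsigned long child)
{
	if (demo_test(parent))
		demo_set(child);
}

/* On exit: drop the pid so a later reuse of the number is not traced. */
static void demo_on_exit(unsigned long pid)
{
	demo_clear(pid);
}
```

Both hooks are a constant-time bit operation, with no list to grow or
search when a filtered task forks.
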
-- Steve