Message-ID: <20210630173400.7963f619@oasis.local.home>
Date:   Wed, 30 Jun 2021 17:34:00 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Paul Burton <paulburton@...gle.com>
Cc:     linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
        Joel Fernandes <joelaf@...gle.com>, stable@...r.kernel.org
Subject: Re: [PATCH 2/2] tracing: Resize tgid_map to PID_MAX_LIMIT, not
 PID_MAX_DEFAULT

On Wed, 30 Jun 2021 14:09:42 -0700
Paul Burton <paulburton@...gle.com> wrote:

> Hi Steven,
> 
> On Wed, Jun 30, 2021 at 08:35:13AM -0400, Steven Rostedt wrote:
> > On Tue, 29 Jun 2021 17:34:06 -0700
> > Paul Burton <paulburton@...gle.com> wrote:
> >   
> > > On 64-bit systems this will increase the size of tgid_map from 256KiB to
> > > 16MiB. Whilst this 64x increase in memory overhead sounds significant,
> > > 64-bit systems are presumably best placed to accommodate it, and since
> > > tgid_map is only allocated when the record-tgid option is actually used,
> > > presumably the user would rather it spend sufficient memory to actually
> > > record the tgids they expect.
> > 
> > NAK. Please see how I fixed this for the saved_cmdlines, and implement
> > it the same way.
> > 
> > 785e3c0a3a87 ("tracing: Map all PIDs to command lines")
> > 
> > It's a cache, it doesn't need to save everything.  
> 
> Well sure, but it's a cache that (modulo pid recycling) previously had a
> 100% hit rate for tasks observed in sched_switch events.

Obviously it wasn't 100% when it went over PID_MAX_DEFAULT.
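
For reference, the approach in that commit is roughly the following.
This is a simplified, illustrative sketch, not the actual code in
kernel/trace/trace.c; the sizes and helper names here are made up:

#include <string.h>

#define PID_MAX_DEFAULT		(1 << 15)	/* default pid_max */
#define SAVED_CMDLINES		128		/* cmdline slots */
#define TASK_COMM_LEN		16

/* Every PID folds into a fixed-size index by masking. */
static int map_pid_to_cmdline[PID_MAX_DEFAULT];	   /* pid -> slot */
static int map_cmdline_to_pid[SAVED_CMDLINES];	   /* slot -> owner */
static char saved_cmdlines[SAVED_CMDLINES][TASK_COMM_LEN];

static void save_cmdline(int pid, const char *comm)
{
	static int idx;
	int map = idx++ % SAVED_CMDLINES;	/* evict round-robin */

	map_pid_to_cmdline[pid & (PID_MAX_DEFAULT - 1)] = map;
	map_cmdline_to_pid[map] = pid;		/* record the owner */
	strncpy(saved_cmdlines[map], comm, TASK_COMM_LEN - 1);
}

static const char *find_cmdline(int pid)
{
	int map = map_pid_to_cmdline[pid & (PID_MAX_DEFAULT - 1)];

	/*
	 * Cross-check the slot's owner, so two PIDs that mask to the
	 * same index can't be confused: the loser just misses.
	 */
	if (map_cmdline_to_pid[map] == pid)
		return saved_cmdlines[map];

	return "<...>";		/* cache miss */
}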

> 
> It differs from saved_cmdlines in a few key ways that led me to treat it
> differently:
> 
> 1) The cost of allocating map_pid_to_cmdline is paid by all users of
>    ftrace, whilst, as I mentioned in my commit description, the cost
>    of allocating tgid_map is only paid by those who actually enable
>    the record-tgid option.

I'll admit that this is a valid point.
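
For anyone following along, the allocate-on-enable path is roughly
this (a trimmed sketch of set_tracer_flag() in kernel/trace/trace.c;
locking and the flag plumbing are omitted):

if (mask == TRACE_ITER_RECORD_TGID) {
	if (!tgid_map)
		tgid_map = kvcalloc(PID_MAX_DEFAULT + 1,
				    sizeof(*tgid_map),
				    GFP_KERNEL);
	if (!tgid_map)
		return -ENOMEM;
	/* Nobody pays for tgid_map until record-tgid is set. */
}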

> 
> 2) We verify that the data in map_pid_to_cmdline is valid by
>    cross-referencing it against map_cmdline_to_pid before reporting it.
>    We don't currently have an equivalent for tgid_map, so we'd need to
>    add a second array or make tgid_map an array of struct { int pid; int
>    tgid; } to avoid reporting incorrect tgids. We'd therefore need to
>    double the memory we consume or further reduce the effectiveness of
>    this cache.

Double 256K is just 512K, which is still much less than 16M.
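
The struct variant you describe would be something like this
(an illustrative sketch; the names are mine, not the kernel's):

struct tgid_map_entry {
	int pid;	/* slot owner, for verification */
	int tgid;
};

/* Twice the size of a plain int array of the same length. */
static struct tgid_map_entry tgid_map[PID_MAX_DEFAULT];

static int trace_find_tgid(int pid)
{
	struct tgid_map_entry *e;

	e = &tgid_map[pid & (PID_MAX_DEFAULT - 1)];

	/* A stale or colliding slot is a miss, not a bogus tgid. */
	return e->pid == pid ? e->tgid : 0;
}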

> 
> 3) As mentioned before, with the default pid_max tgid_map/record-tgid
>    has a 100% hit rate, which was never the case for saved_cmdlines.
>    If we go with a solution that changes this property then I
>    certainly think the docs need updating - the description of
>    saved_tgids in Documentation/trace/ftrace.rst makes no mention of
>    this being anything but a perfect recreation of pid->tgid
>    relationships, and unlike the description of saved_cmdlines it
>    doesn't use the word "cache" at all.
> 
> Having said that, I think taking a similar approach to saved_cmdlines
> would be better than what we have now, though I'm not sure whether
> it'll be sufficient to actually be usable for me. My use case is
> grouping threads into processes when displaying scheduling
> information, and experience tells me that if any threads don't get
> grouped appropriately, the result will be questions.

I found a few bugs recently in the saved_cmdlines code that were
causing a much higher miss rate, but now it's rather accurate. I
wonder how much pain that was causing you?

Anyway, I'll wait to hear what Joel says on this. If he thinks this is
worth 16M of memory when enabled, I may take it.

-- Steve
